Abstract. The framework of algorithmic knowledge assumes that agents use deterministic
knowledge algorithms to compute the facts they explicitly know. We extend the
framework to allow for randomized knowledge algorithms. We then characterize the
information provided by a randomized knowledge algorithm when its answers have some
probability of being incorrect. We formalize this information in terms of evidence; a
randomized knowledge algorithm returning “Yes” to a query about a fact $\varphi$ provides
evidence for $\varphi$ being true. Finally, we discuss the extent to which this evidence can be
used as a basis for decisions.


1.  Introduction 

Under the standard possible-worlds interpretation of knowledge, which goes back to
Hintikka [1962], an agent knows $\varphi$ if $\varphi$ is true at all the worlds the agent considers
possible. This interpretation of knowledge has been found useful in capturing some important
intuitions in the analysis of distributed protocols [Fagin, Halpern, Moses, and Vardi 1995].
However, its usefulness is somewhat limited by what Hintikka [1962] called the logical
omniscience problem: agents know all tautologies and know all logical consequences of their
knowledge. Many approaches have been developed to deal with the logical omniscience
problem (see [Fagin, Halpern, Moses, and Vardi 1995, Chapters 10 and 11] for a discussion
and survey). We focus on one approach here that has been called algorithmic knowledge
[Halpern, Moses, and Vardi 1994]. The idea is simply to assume that agents are equipped
with “knowledge algorithms” that they use to compute what they know. An agent
algorithmically knows $\varphi$ if his knowledge algorithm says “Yes” when asked $\varphi$.¹

Algorithmic knowledge is a very general approach. For example, Berman, Garay, and
Perry [1989] implicitly use a particular form of algorithmic knowledge in their analysis
of Byzantine agreement. Roughly speaking, they allow agents to perform limited tests
based on the information they have; agents know only what follows from these limited
tests. Ramanujam [1999] investigates a particular form of algorithmic knowledge, where
the knowledge algorithm is essentially a model-checking procedure for a standard logic of
knowledge. More specifically, Ramanujam considers, at every state, the part of the model
that a particular agent sees (for instance, an agent in a distributed system may be aware
only of its immediate neighbors, the ones with whom it can communicate) and takes as
knowledge algorithm the model-checking procedure for epistemic logic, applied to the
submodel generated by the visible states. Halpern and Pucella [2002] have applied algorithmic
knowledge to security to capture adversaries who are resource bounded (and thus, for
example, cannot factor the products of large primes that arise in the RSA cryptosystem
[Rivest, Shamir, and Adleman 1978]).

All  these  examples  use  sound  knowledge  algorithms:  although  the  algorithm  may  not 
give an answer under all circumstances, when it says “Yes” on input $\varphi$, the agent really
does know $\varphi$ in the standard possible-worlds sense. Although soundness is not required in
the  basic  definition,  it  does  seem  to  be  useful  in  many  applications. 

Our  interest  in  this  paper  is  knowledge  algorithms  that  may  use  some  randomization. 
As  we  shall  see,  there  are  numerous  examples  of  natural  randomized  knowledge  algorithms. 
With  randomization,  whether  or  not  the  knowledge  algorithm  says  “Yes”  may  depend  on  the 
outcome  of  coin  tosses.  This  poses  a  slight  difficulty  in  even  giving  semantics  to  algorithmic 
knowledge,  since  the  standard  semantics  makes  sense  only  for  deterministic  algorithms.  To 
deal  with  this  problem,  we  make  the  algorithms  deterministic  by  supplying  them  an  extra 
argument  (intuitively,  the  outcome  of  a  sequence  of  coin  tosses)  to  “derandomize”  them. 
We  show  that  this  approach  provides  a  natural  extension  of  the  deterministic  case. 

To motivate the use of randomized knowledge algorithms, we consider a security
example from Halpern and Pucella [2002]. The framework in that paper lets us reason about
principals  communicating  in  the  presence  of  adversaries,  using  cryptographic  protocols. 
The  typical  assumption  made  when  analyzing  security  protocols  is  that  adversaries  can 
intercept  all  the  messages  exchanged  by  the  principals,  but  cannot  necessarily  decrypt  en¬ 
crypted  messages  unless  they  have  the  appropriate  decryption  key.  To  capture  precisely  the 
capabilities  of  adversaries,  we  use  knowledge  algorithms.  Roughly  speaking,  a  knowledge 
algorithm  for  an  adversary  will  specify  what  information  the  adversary  can  extract  from  in¬ 
tercepted  messages.  In  this  paper,  we  consider  an  adversary  that  further  attempts  to  guess 
the  cryptographic  keys  used  by  the  principals  in  the  protocol.  We  show  how  to  capture  the 
knowledge  of  such  an  adversary  using  a  randomized  knowledge  algorithm. 

Having defined the framework, we try to characterize the information obtained by
getting a “Yes” answer to a query for $\varphi$. If the knowledge algorithm is sound, then a “Yes”
answer guarantees that $\varphi$ is true. However, the randomized algorithms of most interest
to us give wrong answers with positive probability, so are not sound. Nevertheless, it
certainly seems that if the probability that the algorithm gives the wrong answer is low, it
provides very useful information when it says “Yes” to a query $\varphi$. This intuition already
appears in the randomized algorithms literature, where a “Yes” answer from a highly reliable
randomized algorithm (i.e., one with a low probability of being wrong) is deemed “good
enough”. In what sense is this true? One contribution of our work is to provide a formal
answer to that question. It may seem that a “Yes” answer to a query $\varphi$ from a highly
reliable randomized knowledge algorithm should make the probability that $\varphi$ is true
high but, as we show, this is not necessarily true. Rather, the information should be viewed
as evidence that $\varphi$ is true; the probability that $\varphi$ is true also depends in part on the prior
probability of $\varphi$.
Evidence has been widely studied in the literature on inductive logic [Kyburg 1983].
We focus on the evidence contributed specifically by a randomized knowledge algorithm.
In a companion paper [Halpern and Pucella 2003], we consider a formal logic for reasoning
about evidence.
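
The dependence on the prior can be made concrete with a small Bayes' rule computation. The sketch below is only illustrative: the function name `posterior_given_yes` and the reliability figures (a “Yes” with probability 0.99 when $\varphi$ is true and 0.01 when it is false) are assumptions chosen for the example, not values from this paper.

```python
def posterior_given_yes(prior: float, p_yes_if_true: float, p_yes_if_false: float) -> float:
    """Posterior probability that phi is true after the algorithm says "Yes" (Bayes' rule)."""
    yes_and_true = prior * p_yes_if_true
    yes_and_false = (1.0 - prior) * p_yes_if_false
    return yes_and_true / (yes_and_true + yes_and_false)


# Same (hypothetical) algorithm, three different priors for phi.
for prior in (0.5, 0.01, 0.0001):
    print(prior, posterior_given_yes(prior, 0.99, 0.01))
```

With a prior of 0.5 the posterior is 0.99, but with a prior of 0.01 it drops to 0.5, even though the algorithm itself is unchanged. This is the sense in which a “Yes” answer is better read as evidence for $\varphi$ than as a probability for $\varphi$.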

The rest of this paper is organized as follows. In Section 2 we review algorithmic
knowledge (under the assumption that knowledge algorithms are deterministic). In Section 3 we
give semantics to algorithmic knowledge in the presence of randomized knowledge
algorithms. In Section 4 we show how the definition works in the context of an example from
the security domain. In Section 5 we characterize the information provided by a randomized
knowledge algorithm in terms of evidence. We conclude in Section 6. All proofs are deferred
to the appendix.


2.  Reasoning  about  Knowledge  and  Algorithmic  Knowledge 


The aim is to be able to reason about properties of systems involving the knowledge
of agents in the system. To formalize this type of reasoning, we first need a language. The
syntax for a multiagent logic of knowledge is straightforward. Starting with a set $\Phi$ of
primitive propositions, which we can think of as describing basic facts about the system,
such as “the door is closed” or “agent $A$ sent the message $m$ to $B$”, more complicated
formulas are formed by closing off under negation, conjunction, and the modal operators
$K_1, \dots, K_n$ and $X_1, \dots, X_n$. Thus, if $\varphi$ and $\psi$ are formulas, then so are $\neg\varphi$,
$\varphi \wedge \psi$, $K_i\varphi$ (read “agent $i$ knows $\varphi$”), and $X_i\varphi$ (read “agent $i$ can compute $\varphi$”).
As usual, we take $\varphi \vee \psi$ to be an abbreviation for $\neg(\neg\varphi \wedge \neg\psi)$ and
$\varphi \Rightarrow \psi$ to be an abbreviation for $\neg\varphi \vee \psi$.

The standard possible-worlds semantics for knowledge uses Kripke structures [Kripke 1963].
Formally, a Kripke structure is composed of a set $S$ of states or possible worlds, an
interpretation $\pi$ which associates with each state in $S$ a truth assignment to the primitive
propositions (i.e., $\pi(s)(p) \in \{\mathbf{true}, \mathbf{false}\}$ for each state $s \in S$ and each primitive
proposition $p$), and equivalence relations $\sim_i$ on $S$ (recall that an equivalence relation is a
binary relation which is reflexive, symmetric, and transitive). The relation $\sim_i$ is agent $i$’s
possibility relation. Intuitively, $s \sim_i t$ if agent $i$ cannot distinguish state $s$ from state $t$ (so
that if $s$ is the actual state of the world, agent $i$ would consider $t$ a possible state of the
world). For our purposes, the equivalence relations are obtained by taking a set $\mathcal{L}$ of local
states, and giving each agent a view of the state, that is, a function $L_i : S \to \mathcal{L}$. We define
$s \sim_i t$ if and only if $L_i(s) = L_i(t)$. In other words, agent $i$ considers the states $s$ and $t$
indistinguishable if he has the same local state at both states.

To interpret explicit knowledge of the form $X_i\varphi$, we assign to each agent a knowledge
algorithm that the agent can use to determine whether he knows a particular formula. A
knowledge algorithm $A$ takes as inputs a formula of the logic, a local state $\ell$ in $\mathcal{L}$, as well
as the state as a whole. This is a generalization of the original presentation of algorithmic
knowledge [Halpern, Moses, and Vardi 1994], in which the knowledge algorithms did not
take the state as input. The added generality is necessary to model knowledge algorithms
that query the state; for example, a knowledge algorithm might use a sensor to determine
the distance between a robot and a wall (an example discussed in a later section). Knowledge
algorithms are required to be deterministic and terminate on all inputs, with result “Yes”,
“No”, or “?”. A knowledge algorithm says “Yes” to a formula $\varphi$ (in a given state) if the
algorithm determines that the agent knows $\varphi$ at the state, “No” if the algorithm determines
that the agent does not know $\varphi$ at the state, and “?” if the algorithm cannot determine
whether the agent knows $\varphi$.




J.  Y.  HALPERN  AND  R.  PUCELLA 


An algorithmic knowledge structure $M$ is a tuple $(S, \pi, L_1, \dots, L_n, A_1, \dots, A_n)$, where
$L_1, \dots, L_n$ are the view functions on the states, and $A_1, \dots, A_n$ are knowledge algorithms.²

We define what it means for a formula $\varphi$ to be true (or satisfied) at a state $s$ in an
algorithmic knowledge structure $M$, written $(M, s) \models \varphi$, inductively as follows:

$(M, s) \models p$ if $\pi(s)(p) = \mathbf{true}$

$(M, s) \models \neg\varphi$ if $(M, s) \not\models \varphi$

$(M, s) \models \varphi \wedge \psi$ if $(M, s) \models \varphi$ and $(M, s) \models \psi$

$(M, s) \models K_i\varphi$ if $(M, t) \models \varphi$ for all $t$ with $s \sim_i t$

$(M, s) \models X_i\varphi$ if $A_i(\varphi, L_i(s), s) = $ “Yes”.

The first clause shows how we use $\pi$ to define the semantics of the primitive propositions.
The next two clauses, which define the semantics of $\neg$ and $\wedge$, are the standard clauses from
propositional logic. The fourth clause is designed to capture the intuition that agent $i$ knows
$\varphi$ exactly if $\varphi$ is true in all the states that $i$ considers possible. The final clause interprets
$X_i\varphi$ via agent $i$’s knowledge algorithm. Thus, agent $i$ has algorithmic knowledge of $\varphi$ at
a given state if the agent’s algorithm outputs “Yes” when presented with $\varphi$, the agent’s
local state, and the state. (Both the outputs “No” and “?” result in lack of algorithmic
knowledge.) As usual, we say that a formula $\varphi$ is valid in structure $M$ and write $M \models \varphi$ if
$(M, s) \models \varphi$ for all states $s \in S$; $\varphi$ is valid if it is valid in all structures.
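
The five clauses translate almost directly into a recursive evaluator. The following is a minimal sketch for finite structures, not an implementation from the paper: the tuple encoding of formulas, the `Structure` class, and the two-state example are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Formulas as tuples (an encoding assumed for this sketch):
# ("prop", p) | ("not", f) | ("and", f, g) | ("K", i, f) | ("X", i, f)


@dataclass
class Structure:
    states: List[str]
    pi: Dict[str, Dict[str, bool]]   # pi[s][p] is the truth value of p at s
    views: List[Callable]            # views[i](s) is the local state L_i(s)
    algs: List[Callable]             # algs[i](formula, local_state, s) -> "Yes" | "No" | "?"


def holds(M: Structure, s: str, f: tuple) -> bool:
    """Satisfaction relation (M, s) |= f, following the five clauses above."""
    tag = f[0]
    if tag == "prop":
        return M.pi[s][f[1]]
    if tag == "not":
        return not holds(M, s, f[1])
    if tag == "and":
        return holds(M, s, f[1]) and holds(M, s, f[2])
    if tag == "K":   # true at every state the agent cannot distinguish from s
        i, g = f[1], f[2]
        return all(holds(M, t, g) for t in M.states if M.views[i](t) == M.views[i](s))
    if tag == "X":   # the agent's knowledge algorithm answers "Yes"
        i, g = f[1], f[2]
        return M.algs[i](g, M.views[i](s), s) == "Yes"
    raise ValueError(f"unknown formula: {f!r}")


p = ("prop", "p")
M = Structure(
    states=["s1", "s2"],
    pi={"s1": {"p": True}, "s2": {"p": False}},
    # Agent 0 cannot tell s1 from s2; agent 1 sees the state exactly.
    views=[lambda s: "same", lambda s: s],
    algs=[lambda f, l, s: "?",
          lambda f, l, s: "Yes" if f == p and l == "s1" else "?"],
)
```

In this example agent 0 neither implicitly nor explicitly knows `p` at `s1`, while agent 1 does both; at `s2` agent 1's algorithm answers “?” and so yields no algorithmic knowledge.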

We can think of $K_i$ as representing implicit knowledge, facts that the agent implicitly
knows, given its information. One can check that implicit knowledge is closed under
implication, that is, $K_i\varphi \wedge K_i(\varphi \Rightarrow \psi) \Rightarrow K_i\psi$ is valid, and that an agent implicitly knows
all valid formulas, so that if $\varphi$ is valid, then $K_i\varphi$ is valid. These properties say that agents
are very powerful reasoners. What is worse, while it is possible to change some properties
of knowledge by changing the properties of the relation $\sim_i$, no matter how we change it,
we still get closure under implication and knowledge of valid formulas as properties. They
seem to be inescapable features of the possible-worlds approach. This suggests that the
possible-worlds approach is appropriate only for “ideal knowers”, ones that know all valid
formulas as well as all logical consequences of their knowledge, and thus inappropriate for
reasoning about agents that are computationally limited. In contrast, $X_i$ represents explicit
knowledge, facts whose truth the agent can compute explicitly. Since we put no a priori
restrictions on the knowledge algorithms, an agent can explicitly know both $\varphi$ and $\varphi \Rightarrow \psi$
without explicitly knowing $\psi$, for example.

As defined, there is no necessary connection between $X_i\varphi$ and $K_i\varphi$. An algorithm
could very well claim that agent $i$ knows $\varphi$ (i.e., output “Yes”) whenever it chooses to,
including at states where $K_i\varphi$ does not hold. Although algorithms that make mistakes are
common, we are often interested in knowledge algorithms that are correct. We say that a
knowledge algorithm is sound for agent $i$ in the structure $M$ if for all states $s$ of $M$ and
formulas $\varphi$, $A_i(\varphi, L_i(s), s) = $ “Yes” implies $(M, s) \models K_i\varphi$, and $A_i(\varphi, L_i(s), s) = $ “No” implies
$(M, s) \models \neg K_i\varphi$. Thus, a knowledge algorithm is sound if its definite answers are correct. If
we restrict attention to sound algorithms, then algorithmic knowledge can be viewed as an
instance of awareness, as defined by Fagin and Halpern [1988].

² Halpern, Moses, and Vardi [1994] introduced algorithmic knowledge in the context of dynamic systems,
that is, systems evolving in time. The knowledge algorithm is allowed to change at every state of the system.
Since the issues that interest us do not involve time, we do not consider dynamic systems in this paper.
We remark that what we are calling “algorithmic knowledge structures” here are called “algorithmic
structures” in [Fagin, Halpern, Moses, and Vardi 1995; Halpern, Moses, and Vardi 1994]. The term “algorithmic
knowledge structures” is used in the paperback edition of [Fagin, Halpern, Moses, and Vardi 1995].

There is a subtlety here, due to the asymmetry in the handling of the answers returned
by knowledge algorithms. The logic does not let us distinguish between a knowledge
algorithm returning “No” and a knowledge algorithm returning “?”; they both result in lack of
algorithmic knowledge.³ In the section where we define the notion of a reliable knowledge
algorithm, reliability will be characterized in terms of algorithmic knowledge, and thus the
definition will not distinguish between a knowledge algorithm returning “No” or “?”. Thus,
in that section, for simplicity, we consider algorithms that are complete, in the sense that
they always return either “Yes” or “No”, and not “?”. More precisely, for a formula $\varphi$,
define a knowledge algorithm $A_i$ to be $\varphi$-complete for agent $i$ in the structure $M$ if for all
states $s$ of $M$, $A_i(\varphi, L_i(s), s) \in \{$“Yes”, “No”$\}$.
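
On a finite structure, soundness and $\varphi$-completeness can be checked by brute force. The sketch below assumes a hypothetical oracle `knows(s, phi)` that reports whether $(M, s) \models K_i\varphi$; the function names and the tiny example are illustrative only.

```python
def is_sound(states, formulas, view, alg, knows) -> bool:
    """Every definite answer is correct: "Yes" implies K_i(phi), "No" implies not K_i(phi)."""
    for s in states:
        for phi in formulas:
            answer = alg(phi, view(s), s)
            if answer == "Yes" and not knows(s, phi):
                return False
            if answer == "No" and knows(s, phi):
                return False
    return True


def is_complete(states, phi, view, alg) -> bool:
    """phi-completeness: the algorithm never answers "?" on phi."""
    return all(alg(phi, view(s), s) in ("Yes", "No") for s in states)


# Hypothetical example: the agent sees the state, and knows "p" only at s1.
states = ["s1", "s2"]
view = lambda s: s
knows = lambda s, phi: s == "s1"

cautious = lambda phi, local, s: "?"       # never commits: sound, not complete
overclaims = lambda phi, local, s: "Yes"   # always commits: complete, not sound
exact = lambda phi, local, s: "Yes" if local == "s1" else "No"
```

The `cautious` algorithm shows that soundness alone does not force any definite answers, which is why completeness is imposed separately.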

3.  Randomized  Knowledge  Algorithms 

Randomized  knowledge  algorithms  arise  frequently  in  the  literature  (although  they  have 
typically  not  been  viewed  as  knowledge  algorithms).  In  order  to  deal  with  randomized  algo¬ 
rithms  in  our  framework,  we  need  to  address  a  technical  question.  Randomized  algorithms 
are  possibly  nondeterministic;  they  may  not  yield  the  same  result  on  every  invocation  with 
the same arguments. Since $X_i\varphi$ holds at a state $s$ if the knowledge algorithm answers “Yes”
at that state, this means that, with the semantics of the previous section, $X_i\varphi$ would not be
well  defined.  Whether  it  holds  at  a  given  state  depends  on  the  outcome  of  random  choices 
made  by  the  algorithm.  However,  we  expect  the  semantics  to  unambiguously  declare  a 
formula  either  true  or  false. 

Before we describe our solution to the problem, we discuss another potential solution,
which is to define the satisfaction relation probabilistically. That is, rather than associating
a truth value with each formula at each state, we associate a probability $\Pr_s(\varphi)$ with each
formula $\varphi$ at each state $s$. The standard semantics can be viewed as a special case of this
semantics, where the probabilities are always either 0 or 1. Under this approach, it seems
reasonable to take $\Pr_s(p)$ to be either 0 or 1, depending on whether primitive proposition $p$
is true at state $s$, and to take $\Pr_s(X_i\varphi)$ to be the probability that $i$’s knowledge algorithm
returns “Yes” given inputs $\varphi$, $L_i(s)$, and $s$. However, it is not then clear how to define
$\Pr_s(\varphi \wedge \psi)$. Taking it to be $\Pr_s(\varphi)\Pr_s(\psi)$ implicitly treats $\varphi$ and $\psi$ as independent, which
is clearly inappropriate if $\psi$ is $\neg\varphi$.⁴ Even ignoring this problem, it is not clear how to
define $\Pr_s(X_i\varphi \wedge X_i\psi)$, since again there might be correlations between the output of the
knowledge algorithm on input $(\varphi, L_i(s), s)$ and input $(\psi, L_i(s), s)$.

We do not use probabilistic truth values in this paper. Instead, we deal with the
problem by adding information to the semantic model to resolve the uncertainty about the truth
value of formulas of the form $X_i\varphi$. Observe that if the knowledge algorithm $A$ is
randomized, then the answer that $A$ gives on input $(\varphi, \ell, s)$ will depend on the outcome of coin
tosses (or whatever other randomizing device is used by $A$). We thus turn the randomized
algorithm into a deterministic algorithm by supplying it with an appropriate argument. For
example, to an algorithm that makes random choices by tossing coins, we supply a sequence
of outcomes of coin tosses. We can now interpret a knowledge algorithm answering “Yes”
with probability $\alpha$ at a state by considering the probability of those sequences of coin tosses
at the state that make the algorithm answer “Yes”.

³ There may be reasons to distinguish “No” from “?”, and it is certainly possible to extend the logic to
distinguish them.

⁴ To get around this particular problem, some approaches that combine logic and probability give
semantics to formulas by viewing them as random variables (e.g., [Kozen 1985]).

Formally, we start with (possibly randomized) knowledge algorithms $A_1, \dots, A_n$. For
simplicity, assume that the randomness in the knowledge algorithms comes from tossing
coins. A derandomizer is a tuple $v = (v_1, \dots, v_n)$ such that for every agent $i$, $v_i$ is a
sequence of outcomes of coin tosses (heads and tails). There is a separate sequence of coin
tosses for each agent rather than just a single sequence of coin tosses, since we do not want
to assume that all agents use the same coin. Let $V$ be the set of all such derandomizers.
To every randomized algorithm $A$ we associate a derandomized algorithm $A^d$ which takes
as input not just the query $\varphi$, local state $\ell$, and state $s$, but also the sequence $v_i$ of $i$’s
coin tosses, taken from a derandomizer $(v_1, \dots, v_n)$. A probabilistic algorithmic knowledge
structure is a tuple $N = (S, \pi, L_1, \dots, L_n, A_1^d, \dots, A_n^d, \nu)$, where $\nu$ is a probability distribution
on $V$ and $A_i^d$ is the derandomized version of $A_i$. (Note that in a probabilistic algorithmic
knowledge structure the knowledge algorithms are in fact deterministic.)
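
The passage from $A$ to $A^d$ can be illustrated with a toy key-guessing algorithm. This is only a sketch under stated assumptions: the four-key space, the query name `"has_key"`, and the dictionary-shaped state are hypothetical, and heads and tails are encoded as 1 and 0.

```python
import random
from typing import Sequence


def guess_alg_d(query: str, local_state: dict, state: dict, tosses: Sequence[int]) -> str:
    """Derandomized algorithm A^d: the coin tosses arrive as an explicit argument."""
    guess = 2 * tosses[0] + tosses[1]   # two tosses select one of four candidate keys
    if query == "has_key" and guess == state["key"]:
        return "Yes"
    return "?"


def guess_alg(query: str, local_state: dict, state: dict) -> str:
    """Randomized algorithm A: tosses its own coins, then behaves exactly like A^d."""
    tosses = (random.randint(0, 1), random.randint(0, 1))
    return guess_alg_d(query, local_state, state, tosses)
```

Two calls to `guess_alg` on the same arguments may disagree, whereas `guess_alg_d` always returns the same answer for the same sequence of tosses, which is what lets the semantics assign a definite truth value.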

The only assumption we make about the distribution $\nu$ is that it does not assign zero
probability to the nonempty sets of sequences of coin tosses that determine the result of the
knowledge algorithm. More precisely, we assume that for all agents $i$, formulas $\varphi$, and states
$s$, $\{v \mid A_i^d(\varphi, L_i(s), s, v_i) = $ “Yes”$\} \neq \emptyset$ if and only if $\nu(\{v \mid A_i^d(\varphi, L_i(s), s, v_i) = $ “Yes”$\}) > 0$,
and similarly for “No” and “?” answers. Note that this property is satisfied, for instance, if
$\nu$ assigns nonzero probability to every sequence of coin tosses. We do not impose any other
restrictions on $\nu$. In particular, we do not require that the coin be fair or that the tosses
be independent. Of course, we can capture correlation between the agents’ coins by using
an appropriate distribution $\nu$.

The truth of a formula is now determined relative to a pair $(s, v)$ consisting of a state
$s$ and a derandomizer $v$. We abuse notation and continue to call these pairs states. The
semantics of formulas in a probabilistic algorithmic knowledge structure is a
straightforward extension of their semantics in algorithmic knowledge structures. The semantics of
primitive propositions is given by $\pi$; conjunctions and negations are interpreted as usual;
for knowledge and algorithmic knowledge, we have

$(N, s, v) \models K_i\varphi$ if $(N, t, v') \models \varphi$ for all $v' \in V$ and all $t \in S$ such that $s \sim_i t$

$(N, s, v) \models X_i\varphi$ if $A_i^d(\varphi, L_i(s), s, v_i) = $ “Yes”, where $v = (v_1, \dots, v_n)$.

Here, $A_i^d$ gets $v_i$ as part of its input. $A_i^d(\varphi, L_i(s), s, v_i)$ is interpreted as the output of $A_i^d$
given that $v_i$ describes the outcomes of the coin tosses. It is perhaps best to interpret
$(N, s, v) \models X_i\varphi$ as saying that agent $i$’s knowledge algorithm would say “Yes” if it were run
in state $s$ with derandomizer $v_i$. The semantics for knowledge then enforces the intuition
that the agent knows neither the state nor the derandomizer used.⁵

Having the sequence of coin tosses as part of the input allows us to talk about the
probability that $i$’s algorithm answers “Yes” to the query $\varphi$ at a state $s$. It is simply
$\nu(\{v \mid A_i^d(\varphi, L_i(s), s, v_i) = $ “Yes”$\})$. To capture this in the language, we extend the language to
allow formulas of the form $\Pr(\varphi) \geq \alpha$, read “the probability of $\varphi$ is at least $\alpha$”.⁶ The
semantics of such formulas is straightforward:

$(N, s, v) \models \Pr(\varphi) \geq \alpha$ if $\nu(\{v' \mid (N, s, v') \models \varphi\}) \geq \alpha$.

⁵ A reviewer of the paper suggested that we instead should define $(N, s, v) \models K_i\varphi$ if $(N, t, v) \models \varphi$ for all
$t \in S$ such that $s \sim_i t$. This would be appropriate if the agent knew the derandomizer being used.

⁶ We allow $\alpha$ to be an arbitrary real number here. If we were concerned with complexity results and
having a finitary language, it would make sense to restrict $\alpha$ to being rational, as is done, for example, in
[Fagin, Halpern, and Megiddo 1990]. None of our results would be affected if we restrict $\alpha$ in this way.

Note that the truth of $\Pr(\varphi) \geq \alpha$ at a state $(s, v)$ is independent of $v$. Thus, we can
abuse notation and write $(N, s) \models \Pr(\varphi) \geq \alpha$. In particular, $(N, s) \models \Pr(X_i\varphi) < \alpha$ (or,
equivalently, $(N, s) \models \Pr(\neg X_i\varphi) > 1 - \alpha$) if the probability of the knowledge algorithm
returning “Yes” on a query $\varphi$ is less than $\alpha$, given state $s$.
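
For a single agent and finitely many coin tosses, the probability of a “Yes” answer is just a finite sum over $\nu$. The sketch below is illustrative: the toy algorithm `coin_alg_d`, the two-toss derandomizers, and both distributions are assumptions for the example, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product


def coin_alg_d(query, local_state, state, tosses):
    """Toy derandomized algorithm: says "Yes" only when both tosses are heads (1)."""
    return "Yes" if tosses == (1, 1) else "No"


def prob_yes(alg_d, query, local_state, state, nu):
    """nu({v | alg_d(query, local_state, state, v) = "Yes"}), with nu given as a dict."""
    return sum((p for v, p in nu.items()
                if alg_d(query, local_state, state, v) == "Yes"), Fraction(0))


def pr_at_least(alg_d, query, local_state, state, nu, alpha):
    """Truth of Pr(X_i phi) >= alpha; note that no particular derandomizer is an input."""
    return prob_yes(alg_d, query, local_state, state, nu) >= alpha


# A fair, independent coin and a biased, correlated one; both give every
# sequence nonzero probability, so the assumption on nu is satisfied.
fair = {v: Fraction(1, 4) for v in product((0, 1), repeat=2)}
biased = {(0, 0): Fraction(1, 10), (0, 1): Fraction(1, 10),
          (1, 0): Fraction(1, 10), (1, 1): Fraction(7, 10)}
```

Because `pr_at_least` takes the distribution but not a specific derandomizer, its value cannot depend on $v$, mirroring the observation that the truth of $\Pr(\varphi) \geq \alpha$ at $(s, v)$ is independent of $v$.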

If all the knowledge algorithms used are deterministic, then this semantics agrees with
the semantics given in Section 2. To make this precise, note that if $A$ is deterministic, then
$A^d(\varphi, \ell, v_i) = A^d(\varphi, \ell, v'_i)$ for all $v, v' \in V$. In this case, we abuse notation and write $A(\varphi, \ell)$.

Proposition 3.1. Let $N = (S, \pi, L_1, \dots, L_n, A_1^d, \dots, A_n^d, \nu)$ be a probabilistic algorithmic
knowledge structure, with $A_1, \dots, A_n$ deterministic. Let $M = (S, \pi, L_1, \dots, L_n, A_1, \dots, A_n)$. If there
are no occurrences of $\Pr$ in $\varphi$ then, for all $s \in S$ and all $v \in V$, $(N, s, v) \models \varphi$ if and only if
$(M, s) \models \varphi$.

Thus, derandomizers are not needed to interpret the $X_i$ operators if the knowledge
algorithms are all deterministic. Moreover, in general, derandomizers are necessary only to
interpret the $\Pr$ and $X_i$ operators.

Proposition 3.2. Let $N = (S, \pi, L_1, \dots, L_n, A_1^d, \dots, A_n^d, \nu)$ be a probabilistic algorithmic
knowledge structure and let $M = (S, \pi, L_1, \dots, L_n, A'_1, \dots, A'_n)$ be an algorithmic knowledge
structure, where $A'_1, \dots, A'_n$ are arbitrary deterministic knowledge algorithms. If there are
no occurrences of $X_i$ and $\Pr$ in $\varphi$ then, for all $s \in S$ and all $v \in V$, $(N, s, v) \models \varphi$ if and
only if $(M, s) \models \varphi$.

Propositions 3.1 and 3.2 justify the decision to “factor out” the randomization of the
knowledge  algorithms  into  semantic  objects  that  are  distinct  from  the  states;  the  semantics 
of  formulas  that  do  not  depend  on  the  randomized  choices  do  not  in  fact  depend  on  those 
additional  semantic  objects. 


4.  An  Example  from  Security 

As we mentioned in the introduction, an important area of application for algorithmic
knowledge is the analysis of cryptographic protocols. In previous work [Halpern and Pucella 2002],
we showed how algorithmic knowledge can be used to model the resource limitations of an
adversary. We briefly review the framework of that paper here.

Participants in a security protocol are viewed as exchanging messages in the free
algebra generated by a set $\mathcal{P}$ of plaintexts and a set $\mathcal{K}$ of keys, over abstract operations $\cdot$
(concatenation) and $\{|\cdot|\}$ (encryption). The set $\mathcal{M}$ of messages is the smallest set that
contains $\mathcal{K}$ and $\mathcal{P}$ and is closed under encryption and concatenation, so that if $m_1$ and $m_2$ are
in $\mathcal{M}$ and $k \in \mathcal{K}$, then $m_1 \cdot m_2$ and $\{|m_1|\}_k$ are in $\mathcal{M}$. We identify elements of $\mathcal{M}$ under the
equivalence $\{|\{|m|\}_k|\}_{k^{-1}} = m$. We make the assumption, standard in the security literature,
that concatenation and encryption have enough redundancy to recognize that a term is in
fact a concatenation $m_1 \cdot m_2$ or an encryption $\{|m|\}_k$.

In  an  algorithmic  knowledge  security  structure ,  some  of  the  agents  are  participants  in  the 
security  protocol  being  modeled,  while  other  agents  are  adversaries  that  do  not  participate 
in  the  protocol,  but  attempt  to  subvert  it.  The  adversary  is  viewed  as  just  another  agent, 
whose  local  state  contains  all  the  messages  it  has  intercepted,  as  well  as  the  keys  initially 
known to the adversary, such as the public keys of all the agents. We use $\mathit{initkeys}(\ell)$ to
denote the set of initial keys known by an agent with local state $\ell$ and write $\mathsf{recv}(m) \in \ell$ if
$m$ is one of the messages received (or intercepted in the case of the adversary) by an agent
with local state $\ell$. We assume that the language includes a primitive proposition $\mathsf{has}_i(m)$
for every message $m$, essentially saying that message $m$ is contained within a message that
agent $i$ has received. Define the containment relation $\sqsubseteq$ on $\mathcal{M}$ as the smallest relation
satisfying the following constraints:

(1) $m \sqsubseteq m$;

(2) if $m \sqsubseteq m_1$, then $m \sqsubseteq m_1 \cdot m_2$;

(3) if $m \sqsubseteq m_2$, then $m \sqsubseteq m_1 \cdot m_2$;

(4) if $m \sqsubseteq m_1$, then $m \sqsubseteq \{|m_1|\}_k$.

Formally, $\mathsf{has}_i(m)$ is true at a local state $\ell$ if $m \sqsubseteq m'$ for some message $m'$ such that
$\mathsf{recv}(m') \in \ell$.
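
The message algebra and the four containment constraints can be sketched with a small recursive data type. The class names `Atom`, `Concat`, and `Enc` and the function `contained` are assumptions for this illustration; plaintexts and keys are both represented as atoms.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """A plaintext or a key."""
    name: str


@dataclass(frozen=True)
class Concat:
    """The concatenation of two messages."""
    left: "Msg"
    right: "Msg"


@dataclass(frozen=True)
class Enc:
    """The encryption of a message under a key."""
    body: "Msg"
    key: Atom


Msg = Union[Atom, Concat, Enc]


def contained(m: Msg, big: Msg) -> bool:
    """Containment of m in big, one branch per constraint."""
    if m == big:                       # (1) every message contains itself
        return True
    if isinstance(big, Concat):        # (2) and (3): look inside either component
        return contained(m, big.left) or contained(m, big.right)
    if isinstance(big, Enc):           # (4): look inside the encrypted body
        return contained(m, big.body)
    return False
```

Containment ignores keys entirely: it looks through an encryption whether or not anyone can decrypt it, and the encrypting key is not itself contained unless it also occurs in the body. That is why $\mathsf{has}_i(m)$ can be true even when the adversary cannot compute that it is.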

Clearly, the adversary may not explicitly know that he has a given message if that
message is encrypted using a key that the adversary does not know. To capture these
restrictions, Dolev and Yao [1983] gave a now-standard description of capabilities of
adversaries. Succinctly, a Dolev-Yao adversary can compose messages, replay them, or decipher
them if he knows the right keys, but cannot otherwise “crack” encrypted messages. The
Dolev-Yao model can be formalized by a relation $H \vdash_{DY} m$ between a set $H$ of messages and
a message $m$. (Our formalization is equivalent to many other formalizations of Dolev-Yao
in the literature, and is similar in spirit to that of Paulson [1998].) Intuitively, $H \vdash_{DY} m$
means that an adversary can “extract” message $m$ from a set of received messages and keys
$H$, using the allowable operations. The derivation is defined using the following inference
rules:

$$\frac{m \in H}{H \vdash_{DY} m} \qquad \frac{H \vdash_{DY} \{|m|\}_k \quad H \vdash_{DY} k^{-1}}{H \vdash_{DY} m} \qquad \frac{H \vdash_{DY} m_1 \cdot m_2}{H \vdash_{DY} m_1} \qquad \frac{H \vdash_{DY} m_1 \cdot m_2}{H \vdash_{DY} m_2}$$

where $k^{-1}$ represents the key used to decrypt messages encrypted with $k$.

We can encode these capabilities via a knowledge algorithm $A_i^{DY}$ for the adversary
as agent $i$. Intuitively, the knowledge algorithm $A_i^{DY}$ simply implements a search for the
derivation of a message $m$ from the messages that the agent has received and the initial set
of keys, using the rules given above. The most interesting case in the definition of $A_i^{DY}$ is
when the formula is $\mathsf{has}_i(m)$. To compute $A_i^{DY}(\mathsf{has}_i(m), \ell, s)$, the algorithm simply checks,
for every message $m'$ received by the adversary, whether $m$ is a submessage of $m'$, according
to the keys that are known to the adversary (given by the function keysof). Checking
whether $m$ is a submessage of $m'$ is performed by a function submsg, which can take apart
messages created by concatenation, or decrypt messages as long as the adversary knows the
decryption key. (The function submsg basically implements the inference rules for $\vdash_{DY}$.)
$A_i^{DY}(\mathsf{has}_i(m), \ell, s)$ is defined by the following algorithm:

    if m ∈ initkeys(ℓ) then return “Yes”
    K = keysof(ℓ)
    for each recv(m′) ∈ ℓ do
        if submsg(m, m′, K) then
            return “Yes”
    return “?”.
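The search performed by this algorithm can be sketched in Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the message representation (`Atom`, `Pair`, `Enc`), the convention that every key is its own inverse, and the fixpoint computation inside `keysof` are all introduced here for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:          # hypothetical: an atomic message or key, identified by name
    name: str


@dataclass(frozen=True)
class Pair:          # concatenation m1 . m2
    left: "Msg"
    right: "Msg"


@dataclass(frozen=True)
class Enc:           # encryption {|body|}_key
    body: "Msg"
    key: str


Msg = Union[Atom, Pair, Enc]


def inverse(key: str) -> str:
    """Assumption: symmetric keys, so each key is its own inverse."""
    return key


def reachable_atoms(m: Msg, keys: set) -> set:
    """Names of atoms that can be extracted from m using the given keys."""
    if isinstance(m, Atom):
        return {m.name}
    if isinstance(m, Pair):
        return reachable_atoms(m.left, keys) | reachable_atoms(m.right, keys)
    if inverse(m.key) in keys:
        return reachable_atoms(m.body, keys)
    return set()


def keysof(received: list, initkeys: set) -> set:
    """Close the initial keys under extraction from the received messages."""
    keys = set(initkeys)
    while True:
        new = keys.union(*(reachable_atoms(m, keys) for m in received))
        if new == keys:
            return keys
        keys = new


def submsg(m: Msg, m_prime: Msg, keys: set) -> bool:
    """Is m extractable from m_prime by splitting pairs and decrypting?"""
    if m == m_prime:
        return True
    if isinstance(m_prime, Pair):
        return submsg(m, m_prime.left, keys) or submsg(m, m_prime.right, keys)
    if isinstance(m_prime, Enc) and inverse(m_prime.key) in keys:
        return submsg(m, m_prime.body, keys)
    return False


def a_dy_has(m: Msg, received: list, initkeys: set) -> str:
    """Answer the query has_i(m) from the received messages and initial keys."""
    if isinstance(m, Atom) and m.name in initkeys:
        return "Yes"
    keys = keysof(received, initkeys)
    for m_prime in received:
        if submsg(m, m_prime, keys):
            return "Yes"
    return "?"
```

As in the text, the sketch never consults the global state: its answer depends only on the received messages and the initial keys, which stand in for the local state.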


Note that the algorithm does not use the input $s$. Further details can be found in
[Halpern and Pucella 2002], where it is also shown that $\mathtt{A}_i^{DY}$ is a sound knowledge algorithm
that captures the Dolev-Yao adversary in the following sense:

Proposition 4.1. [Halpern and Pucella 2002] If $M = (S, \pi, L_1, \ldots, L_n, \mathtt{A}_1, \ldots, \mathtt{A}_n)$ is an
algorithmic knowledge security structure with an adversary as agent $i$ and $\mathtt{A}_i = \mathtt{A}_i^{DY}$, then
$(M, s) \models X_i(\mathsf{has}_i(m))$ if and only if $\{m \mid \mathsf{recv}(m) \in L_i(s)\} \cup \mathit{initkeys}(\ell) \vdash_{DY} m$. Moreover,
if $(M, s) \models X_i(\mathsf{has}_i(m))$ then $(M, s) \models \mathsf{has}_i(m)$.

The Dolev-Yao algorithm is deterministic. It does not capture, for example, an adversary
who guesses keys in an effort to crack an encryption. Assume that the key space
consists of finitely many keys, and let guesskeys(r) return $r$ of these, chosen uniformly at
random. Let $\mathtt{A}_i^{DY+rg(r)}$ be the result of modifying $\mathtt{A}_i^{DY}$ to take random guessing into account
(the rg stands for random guess), so that $\mathtt{A}_i^{DY+rg(r)}(\mathsf{has}_i(m), \ell, s)$ is defined by the following
algorithm:

    if m ∈ initkeys(ℓ) then return “Yes”
    K = keysof(ℓ) ∪ guesskeys(r)
    for each recv(m′) ∈ ℓ do
        if submsg(m, m′, K) then
            return “Yes”
    return “?”.

(As before, the algorithm does not use the input $s$.) Using $\mathtt{A}_i^{DY+rg(r)}$, the adversary gets
to work with whatever keys he already had available, all the keys he can obtain using
the standard Dolev-Yao algorithm, and the additional $r$ randomly chosen keys returned by
guesskeys(r).

Of  course,  if  the  total  number  of  keys  is  large  relative  to  r,  making  r  random  guesses 
should  not  help  much.  Our  framework  lets  us  make  this  precise. 

Proposition 4.2. Suppose that $N = (S, \pi, L_1, \ldots, L_n, \mathtt{A}_1^d, \ldots, \mathtt{A}_n^d, \nu)$ is a probabilistic
algorithmic knowledge security structure with an adversary as agent $i$ and that $\mathtt{A}_i = \mathtt{A}_i^{DY+rg(r)}$.
Let $K$ be the number of distinct keys used in the messages in the adversary's local state $\ell$
(i.e., the number of keys used in the messages that the adversary has intercepted at a state
$s$ with $L_i(s) = \ell$). Suppose that $K/|\mathcal{K}| < 1/2$ and that $\nu$ is the uniform distribution on
sequences of coin tosses. If $(N, s, v) \models \neg K_i X_i(\mathsf{has}_i(m))$, then
$(N, s, v) \models \Pr(X_i(\mathsf{has}_i(m))) < 1 - e^{-2rK/|\mathcal{K}|}$. Moreover, if $(N, s, v) \models X_i(\mathsf{has}_i(m))$ then $(N, s, v) \models \mathsf{has}_i(m)$.

Proposition 4.2 says that what we expect to be true is in fact true: random guessing
of keys is sound, but it does not help much (at least, if the number of keys guessed is a
small fraction of the total number of keys). If it is possible that the adversary does not
have algorithmic knowledge of $m$, then the probability that he has algorithmic knowledge
is low. While this result just formalizes our intuitions, it does show that the probabilistic
algorithmic knowledge framework has the resources to formalize these intuitions naturally.
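To get a feel for how small the bound in Proposition 4.2 is, one can evaluate $1 - e^{-2rK/|\mathcal{K}|}$ numerically. The following is a minimal sketch; the function name `guess_bound` and the sample parameter values are assumptions made only for illustration.

```python
import math


def guess_bound(r: int, keys_in_state: int, key_space: int) -> float:
    """Evaluate 1 - exp(-2*r*K/|KS|), the bound of Proposition 4.2.

    r             -- number of randomly guessed keys
    keys_in_state -- K, distinct keys used in the adversary's messages
    key_space     -- |KS|, total number of keys
    """
    ratio = keys_in_state / key_space
    if not ratio < 0.5:
        raise ValueError("the proposition assumes K/|KS| < 1/2")
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-2 * r * ratio)
```

For instance, with a hypothetical 64-bit key space, `guess_bound(1000, 10, 2**64)` is on the order of $10^{-15}$, so a thousand guesses barely move the probability.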

5.  Probabilistic  Algorithmic  Knowledge 

While  the  “guessing”  extension  of  the  Dolev-Yao  algorithm  considered  in  the  previous 
section  is  sound,  we  are  often  interested  in  randomized  knowledge  algorithms  that  may 
sometimes  make  mistakes.  We  consider  a  number  of  examples  in  this  section,  to  motivate 
our  approach. 

First,  suppose  that  Bob  knows  (or  believes)  that  a  coin  is  either  fair  or  double-headed, 
and  wants  to  determine  which.  He  cannot  examine  the  coin,  but  he  can  “test”  it  by  having 
it  tossed  and  observing  the  outcome.  Let  dh  be  a  proposition  that  is  true  if  and  only 
if the coin is double-headed. Bob uses the following dh-complete randomized knowledge
algorithm $\mathtt{A}_{Bob}$: when queried about dh, the algorithm “tosses” the coin, returning “Yes”
if the coin lands heads and “No” if the coin lands tails. It is not hard to check that if the
coin is double-headed, then $\mathtt{A}_{Bob}$ answers “Yes” with probability 1 (and hence “No” with
probability 0); if the coin is fair, then $\mathtt{A}_{Bob}$ answers “Yes” with probability 0.5 (and hence
“No” with probability 0.5 as well). Thus, if the coin is fair, there is a chance that $\mathtt{A}_{Bob}$ will
make a mistake, although we can make the probability of error arbitrarily small by applying
the algorithm repeatedly (alternatively, by increasing the number of coin tosses performed
by the algorithm).
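The effect of repeating the test can be made concrete with a small sketch. The names `a_bob` and `prob_yes`, and the choice to answer “No” as soon as one toss lands tails, are assumptions for illustration; the text only describes the single-toss algorithm.

```python
import random
from fractions import Fraction


def a_bob(coin_is_double_headed: bool, tosses: int = 1, rng=random) -> str:
    """Answer the query dh by tossing the coin `tosses` times."""
    for _ in range(tosses):
        heads = coin_is_double_headed or rng.random() < 0.5
        if not heads:
            return "No"      # a single tail rules out a double-headed coin
    return "Yes"


def prob_yes(coin_is_double_headed: bool, tosses: int = 1) -> Fraction:
    """Exact probability that a_bob answers "Yes"."""
    if coin_is_double_headed:
        return Fraction(1)
    return Fraction(1, 2) ** tosses
```

With a fair coin, the probability of a mistaken “Yes” is $2^{-k}$ after $k$ tosses, so ten tosses already bring it below one in a thousand.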

Second, consider a robot navigating using a probabilistic sensor. This sensor returns
the distance to the wall in front of the robot, within some tolerance. For simplicity, suppose
that if the wall is at distance $m$, then the sensor will return a reading of $m - 1$ with probability
1/4, a reading of $m$ with probability 1/2, and a reading of $m + 1$ with probability 1/4. Let
wall(m) be a proposition true at a state if and only if the wall is at distance at most $m$ in
front of the robot. Suppose that the robot uses the following knowledge algorithm $\mathtt{A}_{Robot}$ to
answer queries. Given query wall(m), $\mathtt{A}_{Robot}$ observes the sensor. Suppose that it reads $r$.
If $r \le m$, the algorithm returns “Yes”; otherwise, it returns “No”. It is not hard to check
that if the wall is actually at distance less than or equal to $m$, then $\mathtt{A}_{Robot}$ answers “Yes”
to a query wall(m) with probability $\ge 3/4$ (and hence “No” with probability $\le 1/4$). If
the wall is actually at distance greater than $m$, then $\mathtt{A}_{Robot}$ answers “Yes” with probability
$\le 1/4$ (and hence “No” with probability $\ge 3/4$).

There  are  two  ways  of  modeling  this  situation.  The  first  (which  is  what  we  are  implicitly 
doing)  is  to  make  the  reading  of  the  sensor  part  of  the  knowledge  algorithm.  This  means 
that  the  actual  reading  is  not  part  of  the  agent’s  local  state,  and  that  the  output  of  the 
knowledge  algorithm  depends  on  the  global  state.  The  alternative  would  have  been  to  model 
the  process  of  reading  the  sensor  in  the  agent’s  local  state.  In  that  case,  the  output  of  the 
knowledge  algorithm  would  depend  only  on  the  agent’s  local  state.  There  is  a  tradeoff  here. 
While  on  the  one  hand  it  is  useful  to  have  the  flexibility  of  allowing  the  knowledge  algorithm 
to  depend  on  the  global  state,  ultimately,  we  do  not  want  the  knowledge  algorithm  to  use 
information  in  the  global  state  that  is  not  available  to  the  agent.  For  example,  we  would 
not  want  the  knowledge  algorithm’s  answer  to  depend  on  the  actual  distance  to  the  wall 
(beyond  the  extent  to  which  the  sensor  reading  depends  on  the  actual  distance).  It  is  up 
to  the  modeler  to  ensure  that  the  knowledge  algorithm  is  appropriate.  A  poor  model  will 
lead  to  poor  results. 

Finally, suppose that Alice has in her local state a number $n > 2$. Let prime be a
proposition true at state $s$ if and only if the number $n$ in Alice's local state is prime.
Clearly, Alice either (implicitly) knows prime or knows $\neg$prime. However, this is implicit
knowledge. Suppose that Alice uses Rabin's [1980] primality-testing algorithm to test if $n$
is prime. That algorithm uses a (polynomial-time computable) predicate $P(n, a)$ with the
following properties, for a natural number $n$ and $1 \le a \le n - 1$:

(1) $P(n, a) \in \{0, 1\}$;

(2) if $n$ is composite, $P(n, a) = 1$ for at least $n/2$ choices of $a$;

(3) if $n$ is prime, $P(n, a) = 0$ for all $a$.

Thus, Alice uses the following randomized knowledge algorithm $\mathtt{A}_{Alice}$: when queried about
prime, the algorithm picks a number $a$ at random between 0 and the number $n$ in Alice's
local state; if $P(n, a) = 1$, it says “No” and if $P(n, a) = 0$, it says “Yes”. (It is irrelevant for
our purposes what the algorithm does on other queries.) It is not hard to check that $\mathtt{A}_{Alice}$
has the following properties. If the number $n$ in Alice's local state is prime, then $\mathtt{A}_{Alice}$
answers “Yes” to a query prime with probability 1 (and hence “No” to the same query with
probability 0). If $n$ is composite, $\mathtt{A}_{Alice}$ answers “Yes” to a query prime with probability
$\le 1/2$ and “No” with probability $\ge 1/2$. Thus, if $n$ is composite, there is a chance that
$\mathtt{A}_{Alice}$ will make a mistake, although we can make the probability of error arbitrarily small by
applying the algorithm repeatedly. While this problem seems similar to the double-headed
coin example above, note that we have only bounds on the probabilities here. The actual
probabilities corresponding to a particular number $n$ depend on various number-theoretic
properties of that number. We return to this issue in Section 5.2.
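A predicate with properties (1)-(3) can be sketched as follows. The text names only Rabin's algorithm and does not spell out $P$; the sketch below assumes the usual Miller-Rabin strong-witness test as one possible instantiation, restricts attention to odd $n > 2$, and draws $a$ from $1, \ldots, n-1$. The names `is_witness`, `a_alice`, and `prob_yes` are hypothetical.

```python
import random
from fractions import Fraction


def is_witness(n: int, a: int) -> int:
    """P(n, a) for odd n > 2: 1 if a shows that n is composite, else 0."""
    d, s = n - 1, 0
    while d % 2 == 0:          # write n - 1 = 2^s * d with d odd
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return 0
    for _ in range(s - 1):
        x = pow(x, 2, n)
        if x == n - 1:
            return 0
    return 1


def a_alice(n: int, rng=random) -> str:
    """Answer the query prime using one random choice of a."""
    a = rng.randint(1, n - 1)
    return "No" if is_witness(n, a) else "Yes"


def prob_yes(n: int) -> Fraction:
    """Exact probability that a_alice answers "Yes" for this n."""
    non_witnesses = sum(1 - is_witness(n, a) for a in range(1, n))
    return Fraction(non_witnesses, n - 1)
```

Under this instantiation the probability of a “Yes” answer differs from one composite to the next (for example, it is 1/4 for 9 and 1/7 for 15), which illustrates why only bounds are available in general.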

Randomized  knowledge  algorithms  like  those  in  the  examples  above  are  quite  common 
in  the  literature.  They  are  not  sound,  but  are  “almost  sound”.  The  question  is  what  we 
can learn from such an “almost sound” algorithm. Returning to the first example, we know
the probability that $\mathtt{A}_{Bob}$ says “Yes” (to the query dh) given that the coin is double-headed;
what we are interested in is the probability that the coin is double-headed given that $\mathtt{A}_{Bob}$
says “Yes”. (Of course, the coin is either double-headed or not. However, if Bob has to
make decisions based on whether the coin is double-headed, it seems reasonable for him
to ascribe a subjective probability to the coin being double-headed. It is this subjective
probability that we are referring to here.)

Taking “dh” to represent the event “the coin is double-headed” (thus, the proposition
dh is true at exactly the states in dh), by Bayes' rule,

$$\Pr(\mathrm{dh} \mid \mathtt{A}_{Bob} \text{ says “Yes”}) = \frac{\Pr(\mathtt{A}_{Bob} \text{ says “Yes”} \mid \mathrm{dh}) \, \Pr(\mathrm{dh})}{\Pr(\mathtt{A}_{Bob} \text{ says “Yes”})}.$$

The only piece of information in this equation that we have is $\Pr(\mathtt{A}_{Bob}$ says “Yes” $\mid$ dh). If
we had Pr(dh), we could derive $\Pr(\mathtt{A}_{Bob}$ says “Yes”). However, we do not have that information,
since we did not assume a probability distribution on the choice of coin. Although
we do not have the information needed to compute $\Pr(\mathrm{dh} \mid \mathtt{A}_{Bob}$ says “Yes”), there is still
a strong intuition that if $X_i\,\mathrm{dh}$ holds, this tells us something about whether the coin is
double-headed. How can this be formalized?
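To see how the missing prior enters, suppose, purely for illustration, that the coin were double-headed with some prior probability $p$ (an assumption that the text explicitly does not make). Using the likelihoods 1 and 1/2 from the coin example, Bayes' rule would then give:

```latex
\Pr(\mathrm{dh} \mid \text{``Yes''})
  = \frac{1 \cdot p}{1 \cdot p + \tfrac{1}{2}(1 - p)}
  = \frac{2p}{1 + p},
\qquad\text{e.g.}\quad
p = \tfrac{1}{2} \;\Rightarrow\; \tfrac{2}{3},
\qquad
p = \tfrac{1}{10} \;\Rightarrow\; \tfrac{2}{11}.
```

The posterior ranges over the whole interval $(0, 1)$ as $p$ does, so without a prior no single value can be singled out.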


5.1. Evidence. Intuitively, the fact that $X_i\varphi$ holds provides “evidence” that $\varphi$ holds. But
what is evidence? There are a number of definitions in the literature. They all essentially
give a way to assign a “weight” to different hypotheses based on an observation; they
differ in exactly how they assign the weight (see [Kyburg 1983] for a survey). Some of
these approaches make sense only if there is a probability distribution on the hypotheses.
Since this is typically not the case in the applications of interest to us (for example, in
the primality example, we do not want to assume a probability on the input $n$), we use a
definition of evidence given by Shafer [1982] and Walley [1987], which does not presume a
probability on hypotheses.

We start with a set $\mathcal{H}$ of hypotheses, which we take to be mutually exclusive and
exhaustive; thus, exactly one hypothesis holds at any given time. For the examples of this
paper, the hypotheses of interest have the form $\mathcal{H} = \{h_0, \neg h_0\}$, where the hypothesis $\neg h_0$
is the negation of hypothesis $h_0$. Intuitively, this is because we want to reason about the
evidence associated with a formula or its negation (see Section 5.3). For example, if $h_0$ is
“the coin is double-headed”, then $\neg h_0$ is “the coin is not double-headed” (and thus, if there
are only two kinds of coins, double-headed and fair, then $\neg h_0$ is “the coin is fair”). We are
given a set $\mathcal{O}$ of observations, which can be understood as outcomes of experiments that we
can make. Assume that for each hypothesis $h \in \mathcal{H}$ there is a probability space $(\mathcal{O}, 2^{\mathcal{O}}, \mu_h)$.
Intuitively, $\mu_h(ob)$ is the probability of $ob$ given that hypothesis $h$ holds. While this looks
like a conditional probability, notice that it does not require a probability on $\mathcal{H}$. Taking
$\Delta(\mathcal{O})$ to denote the set of probability measures on $\mathcal{O}$, define an evidence space to be a tuple
$\mathcal{E} = (\mathcal{H}, \mathcal{O}, \mathcal{F})$, where $\mathcal{H}$ and $\mathcal{O}$ are as above and $\mathcal{F} : \mathcal{H} \to \Delta(\mathcal{O})$. Thus, $\mathcal{F}$ associates with each hypothesis
a probability on observations (intuitively, the probability that various observations are true
given that the hypothesis holds). We often denote $\mathcal{F}(h)$ as $\mu_h$. For an evidence space $\mathcal{E}$,
the weight that the observation $ob$ lends to hypothesis $h \in \mathcal{H}$, written $w_{\mathcal{E}}(ob, h)$, is

$$w_{\mathcal{E}}(ob, h) = \frac{\mu_h(ob)}{\sum_{h' \in \mathcal{H}} \mu_{h'}(ob)}. \qquad (5.1)$$

Equation (5.1) does not define a weight $w_{\mathcal{E}}$ for an observation $ob$ such that $\sum_{h \in \mathcal{H}} \mu_h(ob) = 0$.
Intuitively, this means that the observation $ob$ is impossible. In the literature on confirmation
theory it is typically assumed that this case never arises. More precisely, it is assumed
that all observations are possible, so that for every observation $ob$, there is a hypothesis
$h$ such that $\mu_h(ob) > 0$. In our case, making this assumption is unnatural. We want to
view the answers given by knowledge algorithms as observations, and it seems perfectly
reasonable to have a knowledge algorithm that never returns “No”, for instance. As we
shall see (Proposition 5.1), the fact that the weight of evidence is undefined in the case that
$\sum_{h \in \mathcal{H}} \mu_h(ob) = 0$ is not a problem in our intended application, thanks to our assumption
that $\nu$ does not assign zero probability to the nonempty sets of sequences of coin tosses that
determine the result of the knowledge algorithm.

Observe that the measure $w_{\mathcal{E}}$ always lies between 0 and 1, with 1 indicating that the
full weight of the evidence of observation $ob$ is provided to the hypothesis. While the weight
of evidence $w_{\mathcal{E}}$ looks like a probability measure (for instance, for each fixed observation $ob$
for which $\sum_{h \in \mathcal{H}} \mu_h(ob) > 0$, the sum $\sum_{h \in \mathcal{H}} w_{\mathcal{E}}(ob, h)$ is 1), one should not interpret it as a
probability measure. It is simply a way to assign a weight to hypotheses given observations.
It is possible to interpret the weight function $w$ as a prescription for how to update a
prior probability on the hypotheses into a posterior probability on those hypotheses, after
having considered the observations made. We do not focus on these aspects here; see
[Halpern and Pucella 2003] for more details.

For the double-headed coin example, the set $\mathcal{H}$ of hypotheses is $\{\mathrm{dh}, \neg\mathrm{dh}\}$. The observations
$\mathcal{O}$ are simply the possible outputs of the knowledge algorithm $\mathtt{A}_{Bob}$ on the formula
dh, namely, {“Yes”, “No”}. From the discussion following the description of the example,
it follows that $\mu_{\mathrm{dh}}$(“Yes”) = 1 and $\mu_{\mathrm{dh}}$(“No”) = 0, since the algorithm always says “Yes”
when the coin is double-headed. Similarly, $\mu_{\neg\mathrm{dh}}$(“Yes”) is the probability that the algorithm
says “Yes” if the coin is not double-headed. By assumption, the coin is fair if it is
not double-headed, so $\mu_{\neg\mathrm{dh}}$(“Yes”) = 1/2 and $\mu_{\neg\mathrm{dh}}$(“No”) = 1/2. Define $\mathcal{F}(\mathrm{dh}) = \mu_{\mathrm{dh}}$ and
$\mathcal{F}(\neg\mathrm{dh}) = \mu_{\neg\mathrm{dh}}$, and let

$$\mathcal{E} = (\{\mathrm{dh}, \neg\mathrm{dh}\}, \{\text{“Yes”}, \text{“No”}\}, \mathcal{F}).$$

It is easy to check that $w_{\mathcal{E}}$(“Yes”, dh) = 2/3 and $w_{\mathcal{E}}$(“Yes”, $\neg$dh) = 1/3. Intuitively, a “Yes”
answer to the query dh provides more evidence for the hypothesis dh than the hypothesis
$\neg$dh. Similarly, $w_{\mathcal{E}}$(“No”, dh) = 0 and $w_{\mathcal{E}}$(“No”, $\neg$dh) = 1. Thus, an output of “No” to the
query dh indicates that the hypothesis $\neg$dh must hold.
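These weights can be checked mechanically. The following minimal sketch computes Equation (5.1) for the coin example; the dictionary representation of the likelihoods and the names `weight` and `COIN` are assumptions made for illustration.

```python
from fractions import Fraction


def weight(likelihoods: dict, ob: str, h: str):
    """Weight of evidence w(ob, h) as in Equation (5.1).

    likelihoods maps each hypothesis to its measure mu_h, itself a
    dict from observations to probabilities. Returns None when the
    denominator is zero, i.e. when the weight is undefined.
    """
    total = sum((mu.get(ob, Fraction(0)) for mu in likelihoods.values()),
                Fraction(0))
    if total == 0:
        return None
    return likelihoods[h].get(ob, Fraction(0)) / total


# Likelihoods taken from the double-headed coin example.
COIN = {
    "dh": {"Yes": Fraction(1), "No": Fraction(0)},
    "not_dh": {"Yes": Fraction(1, 2), "No": Fraction(1, 2)},
}
```

Here `weight(COIN, "Yes", "dh")` evaluates to 2/3 and `weight(COIN, "No", "not_dh")` to 1, matching the values above.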

This approach, however, is not quite sufficient to deal with the sensor example because,
in that example, the probability of an observation does not depend solely on whether the
hypothesis is true or false. The probability of the algorithm answering “Yes” to a query
wall(10) when wall(10) is true depends on the actual distance $m$ to the wall:

• if $m \le 9$, then $\mu_{\mathrm{wall}(10)}$(“Yes”) = 1 (and thus $\mu_{\mathrm{wall}(10)}$(“No”) = 0);

• if $m = 10$, then $\mu_{\mathrm{wall}(10)}$(“Yes”) = 3/4 (and thus $\mu_{\mathrm{wall}(10)}$(“No”) = 1/4).

Similarly, the probability of the algorithm answering “Yes” to a query wall(10) in a state
where $\neg$wall(10) holds depends on $m$ in the following way:

• if $m = 11$, then $\mu_{\neg\mathrm{wall}(10)}$(“Yes”) = 1/4;

• if $m \ge 12$, then $\mu_{\neg\mathrm{wall}(10)}$(“Yes”) = 0.

It does not seem possible to capture this information using the type of evidence space defined
above. In particular, we do not have a single probability measure over the observations given
a particular hypothesis. One reasonable way of capturing the information is to associate a
set of probability measures on observations with each hypothesis; intuitively, these represent
the possible probabilities on the observations, depending on the actual state.

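To make this precise, define a generalized evidence space to be a tuple $\mathcal{E} = (\mathcal{H}, \mathcal{O}, \mathcal{F})$,
where now $\mathcal{F} : \mathcal{H} \to 2^{\Delta(\mathcal{O})}$. We require $\mathcal{F}(h) \neq \emptyset$ for at least one $h \in \mathcal{H}$. What is
the most appropriate way to define weight of evidence given sets of probability measures?
As a first step, consider the set of all possible weights of evidence that are obtained by
taking any combination of probability measures, one from each set $\mathcal{F}(h)$ (provided that
$\mathcal{F}(h) \neq \emptyset$). This gives us a range of possible weights of evidence. We can then define upper
and lower weights of evidence, determined by the maximum and minimum values in the
range, somewhat analogous to the notions of upper and lower probability [Halpern 2003].
(Given a set $\mathcal{P}$ of probability measures, the lower probability of a set $U$ is $\inf_{\mu \in \mathcal{P}} \mu(U)$; its
upper probability is $\sup_{\mu \in \mathcal{P}} \mu(U)$.) Let

$$W_{\mathcal{E}}(ob, h) = \left\{ \frac{\mu_h(ob)}{\sum_{h' \in \mathcal{H},\, \mathcal{F}(h') \neq \emptyset} \mu_{h'}(ob)} \;\middle|\; \mu_h \in \mathcal{F}(h),\ \mu_{h'} \in \mathcal{F}(h'),\ \sum_{h' \in \mathcal{H},\, \mathcal{F}(h') \neq \emptyset} \mu_{h'}(ob) \neq 0 \right\}.$$

Thus, $W_{\mathcal{E}}(ob, h)$ is the set of possible weights of evidence for the hypothesis $h$ given by $ob$.
Define the lower weight of evidence function $\underline{w}_{\mathcal{E}}$ by taking $\underline{w}_{\mathcal{E}}(ob, h) = \inf W_{\mathcal{E}}(ob, h)$; similarly,
define the upper weight of evidence function $\overline{w}_{\mathcal{E}}$ by taking $\overline{w}_{\mathcal{E}}(ob, h) = \sup W_{\mathcal{E}}(ob, h)$.
If $W_{\mathcal{E}}(ob, h) = \emptyset$, which will happen either if $\mathcal{F}(h) = \emptyset$ or if $\sum_{h' \in \mathcal{H},\, \mathcal{F}(h') \neq \emptyset} \mu_{h'}(ob) = 0$
for all choices of $\mu_{h'} \in \mathcal{F}(h')$ for $\mathcal{F}(h') \neq \emptyset$, then we define $\underline{w}_{\mathcal{E}}(ob, h) = \overline{w}_{\mathcal{E}}(ob, h) = 0$. We
show in Proposition 5.1 that, in the special case where $\mathcal{F}(h)$ is a singleton for all $h$ (which
has been the focus of all previous work in the literature), $W_{\mathcal{E}}(ob, h)$ is a singleton under our
assumptions. In particular, the denominator is not 0 in this case. Of course, if $\mathcal{F}(h) = \{\mu_h\}$
for all hypotheses $h \in \mathcal{H}$, then $\underline{w}_{\mathcal{E}} = \overline{w}_{\mathcal{E}} = w_{\mathcal{E}}$.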

Lower and upper evidence can be used to model the examples at the beginning of this
section. In the sensor example, with $\mathcal{H} = \{\mathrm{wall}(10), \neg\mathrm{wall}(10)\}$, there are two probability
measures associated with the hypothesis wall(10), namely,

$$\mu_{\mathrm{wall}(10), \le 9}(\text{“Yes”}) = 1$$

$$\mu_{\mathrm{wall}(10), = 10}(\text{“Yes”}) = 3/4;$$

similarly, there are two probability measures associated with the hypothesis $\neg$wall(10),
namely

$$\mu_{\neg\mathrm{wall}(10), = 11}(\text{“Yes”}) = 1/4$$

$$\mu_{\neg\mathrm{wall}(10), \ge 12}(\text{“Yes”}) = 0.$$

Let $\mathcal{E}$ be the corresponding generalized evidence space. It is easy to check that

$$W_{\mathcal{E}}(\text{“Yes”}, \mathrm{wall}(10)) = \{4/5, 1, 3/4\},$$

and thus

$$\underline{w}_{\mathcal{E}}(\text{“Yes”}, \mathrm{wall}(10)) = 3/4 \quad\text{and}\quad \overline{w}_{\mathcal{E}}(\text{“Yes”}, \mathrm{wall}(10)) = 1.$$

Indeed, using $\mu_{\mathrm{wall}(10), \le 9}$ and $\mu_{\neg\mathrm{wall}(10), = 11}$ gives 4/5; using $\mu_{\mathrm{wall}(10), = 10}$ and $\mu_{\neg\mathrm{wall}(10), = 11}$
gives 3/4; and using either $\mu_{\mathrm{wall}(10), \le 9}$ or $\mu_{\mathrm{wall}(10), = 10}$ with $\mu_{\neg\mathrm{wall}(10), \ge 12}$ gives 1. Similarly,

$$W_{\mathcal{E}}(\text{“Yes”}, \neg\mathrm{wall}(10)) = \{1/5, 1/4, 0\},$$

and thus

$$\underline{w}_{\mathcal{E}}(\text{“Yes”}, \neg\mathrm{wall}(10)) = 0 \quad\text{and}\quad \overline{w}_{\mathcal{E}}(\text{“Yes”}, \neg\mathrm{wall}(10)) = 1/4.$$

In particular, if the algorithm answers “Yes” to a query wall(10), the evidence supports the
hypothesis that the wall is indeed at a distance of at most 10 from the robot.
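The sets of weights above can be reproduced by enumerating every combination of measures, one per hypothesis. The sketch below derives the measures from the sensor model of the robot example (readings $m-1$, $m$, $m+1$ with probabilities 1/4, 1/2, 1/4, and “Yes” when the reading is at most 10). The cutoff `max_distance=20` and all function names are assumptions made only to keep the example finite and readable.

```python
from fractions import Fraction
from itertools import product


def sensor_yes_prob(distance: int, m: int) -> Fraction:
    """Probability that the reading is <= m when the wall is at `distance`."""
    readings = {
        distance - 1: Fraction(1, 4),
        distance: Fraction(1, 2),
        distance + 1: Fraction(1, 4),
    }
    return sum((p for r, p in readings.items() if r <= m), Fraction(0))


def build_measures(m: int, max_distance: int) -> dict:
    """Map each hypothesis to the list of measures it may induce."""
    measures = {"wall": [], "not_wall": []}
    for d in range(1, max_distance + 1):
        yes = sensor_yes_prob(d, m)
        mu = {"Yes": yes, "No": 1 - yes}
        measures["wall" if d <= m else "not_wall"].append(mu)
    return measures


def weight_set(measures: dict, ob: str, h: str) -> set:
    """All weights obtained by picking one measure per nonempty hypothesis."""
    nonempty = [g for g in measures if measures[g]]
    if h not in nonempty:
        return set()
    weights = set()
    for choice in product(*(measures[g] for g in nonempty)):
        pick = dict(zip(nonempty, choice))
        total = sum((mu[ob] for mu in pick.values()), Fraction(0))
        if total != 0:          # skip combinations with a zero denominator
            weights.add(pick[h][ob] / total)
    return weights


def lower_upper(measures: dict, ob: str, h: str) -> tuple:
    """Lower and upper weight of evidence; (0, 0) if no weight is defined."""
    ws = weight_set(measures, ob, h)
    return (min(ws), max(ws)) if ws else (Fraction(0), Fraction(0))


MEASURES = build_measures(m=10, max_distance=20)
```

Because the sets here are finite, `min` and `max` stand in for the infimum and supremum of the definition.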

The primality example can be dealt with in the same way. Take $\mathcal{H} = \{\mathrm{prime}, \neg\mathrm{prime}\}$.
There is a single probability measure $\mu_{\mathrm{prime}}$ associated with the hypothesis prime, namely
$\mu_{\mathrm{prime}}$(“Yes”) = 1; intuitively, if the number is prime, the knowledge algorithm always
returns the right answer. In contrast, there are a number of different probability measures
$\mu_{\neg\mathrm{prime}, n}$ associated with the hypothesis $\neg$prime, one per composite number $n$, where
we take $\mu_{\neg\mathrm{prime}, n}$(“Yes”) to be the probability that the algorithm says “Yes” when the
composite number $n$ is in Alice's local state. Note that this probability is 1 minus the
fraction of “witnesses” $a < n$ such that $P(n, a) = 1$. The fraction of witnesses depends on
number-theoretic properties of $n$, and thus may be different for different choices of composite
numbers $n$. Moreover, Alice is unlikely to know the actual probability $\mu_{\neg\mathrm{prime}, n}$. As we
mentioned above, it has been shown that $\mu_{\neg\mathrm{prime}, n}$(“Yes”) $\le 1/2$ for all composite $n$, but Alice may
not know any more than this. Nevertheless, for now, we assume that Alice is an “ideal”
agent who knows the set $\{\mu_{\neg\mathrm{prime}, n} \mid n \text{ is composite}\}$. (Indeed, in the standard Kripke
structure framework for knowledge, it is impossible to assume anything else!) We consider
how to model the set of probabilities used by a “less-than-ideal” agent in Section 5.2. Let
$\mathcal{E}$ be the corresponding generalized evidence space. Then

$$W_{\mathcal{E}}(\text{“Yes”}, \mathrm{prime}) = \{1/(1 + \mu_{\neg\mathrm{prime}, n}(\text{“Yes”})) \mid n \text{ composite}\}.$$

Since $\mu_{\neg\mathrm{prime}, n}$(“Yes”) $\le 1/2$ for all composite $n$, it follows that $\underline{w}_{\mathcal{E}}$(“Yes”, prime) $\ge 2/3$.
Similarly,

$$W_{\mathcal{E}}(\text{“Yes”}, \neg\mathrm{prime}) = \{\mu_{\neg\mathrm{prime}, n}(\text{“Yes”})/(\mu_{\neg\mathrm{prime}, n}(\text{“Yes”}) + 1) \mid n \text{ composite}\}.$$

Since $\mu_{\neg\mathrm{prime}, n}$(“Yes”) $\le 1/2$ for all composite $n$, we have that $\overline{w}_{\mathcal{E}}$(“Yes”, $\neg$prime) $\le 1/3$.
Therefore, if the algorithm answers “Yes” to a query prime, the evidence supports the
hypothesis that the number is indeed prime.
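The two bounds follow from a one-line monotonicity argument. Writing $x$ for the probability of a “Yes” answer on a given composite number (the abbreviation $x$ is introduced here only for this step), and using $0 \le x \le 1/2$:

```latex
\frac{1}{1 + x} \;\ge\; \frac{1}{1 + \tfrac{1}{2}} \;=\; \frac{2}{3},
\qquad
\frac{x}{1 + x} \;=\; 1 - \frac{1}{1 + x} \;\le\; 1 - \frac{2}{3} \;=\; \frac{1}{3}.
```

Since every element of the first set is at least 2/3 and every element of the second is at most 1/3, the same bounds hold for the infimum and supremum, respectively.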

Note that, in modeling this example, we have assumed that the number $n$ is not in
Alice's local state and that Alice knows the fraction of witnesses $a$ for each composite
number $n$. This means that the same set of probabilities is used by Alice for all choices of $n$
(since the set of probabilities used depends only on Alice's local state), and is determined by
the set of possible fractions of elements $< n$ that are witnesses, for each composite number $n$.
Assuming that $n$ is in Alice's local state (which is actually quite a reasonable assumption!)
and that Alice does not know the fraction of numbers less than $n$ that are witnesses adds
new subtleties; we consider them in Section 5.2.

5.2. Evidence for Randomized Knowledge Algorithms. We are now ready to discuss
randomized knowledge algorithms. What does a “Yes” answer to a query $\varphi$ given by an
“almost sound” knowledge algorithm tell us about $\varphi$? As the discussion in Section 5.1
indicates, a “Yes” answer to a query $\varphi$ provides evidence for the hypotheses $\varphi$ and $\neg\varphi$.
This can be made precise by associating an evidence space with every state of the model to
capture the evidence provided by the knowledge algorithm. To simplify the presentation,
we restrict our attention to knowledge algorithms that are $\varphi$-complete. (While it is possible
to deal with general knowledge algorithms that also can return “?” using these techniques,
we already saw that the logic does not let us distinguish between a knowledge algorithm
returning “No” and a knowledge algorithm returning “?”; they both result in lack of algorithmic
knowledge. In the next section, where we define the notion of a reliable knowledge
algorithm, reliability will be characterized in terms of algorithmic knowledge, and thus the
definition will not distinguish between a knowledge algorithm returning “No” or “?”. In
order to establish a link between the notion of reliability and evidence, it is convenient to
either consider $\varphi$-complete algorithms, or somehow identify the answers “No” and “?”. We
choose the former.) Note that the knowledge algorithms described in the examples at the
beginning of this section are all complete for their respective hypotheses. We further assume
that the truth of $\varphi$ depends only on the state, and not on coin tosses, that is, $\varphi$ does not
contain occurrences of the $X_i$ operator.

Our goal is to associate, with every local state $\ell$ of agent $i$ in $N$, an evidence space over
the hypotheses $\{\varphi, \neg\varphi\}$ and the observations {“Yes”, “No”, “?”}. Let $S_\ell = \{s \mid L_i(s) = \ell\}$
be the set of states where agent $i$ has local state $\ell$. At every state $s$ of $S_\ell$, let
$\mu_{s,\varphi}(ob) = \nu(\{v' \mid \mathtt{A}_i^d(\varphi, \ell, s, v'_i) = ob\})$. Intuitively, $\mu_{s,\varphi}$ gives the probability of observing “Yes” and
“No” in state $s$. Let $S_{\ell,\varphi} = \{s \in S_\ell \mid (N, s) \models \varphi\}$ and let $S_{\ell,\neg\varphi} = \{s \in S_\ell \mid (N, s) \models \neg\varphi\}$.
(Recall that $\varphi$ depends only on the state, and not on the outcome of the coin tosses.) Define
$\mathcal{F}_{\ell,\varphi}(\varphi) = \{\mu_{s,\varphi} \mid s \in S_{\ell,\varphi}\}$ and $\mathcal{F}_{\ell,\varphi}(\neg\varphi) = \{\mu_{s,\varphi} \mid s \in S_{\ell,\neg\varphi}\}$; then the evidence space is

$$\mathcal{E}_{\mathtt{A}_i,\varphi,\ell} = (\{\varphi, \neg\varphi\}, \{\text{“Yes”}, \text{“No”}, \text{“?”}\}, \mathcal{F}_{\ell,\varphi}).$$

(We omit the “?” from the set of possible observations if the knowledge algorithm is $\varphi$-complete,
as is the case in the three examples given at the beginning of this section.) Since
the agent does not know which state $s \in S_\ell$ is the true state, he must consider all the
probabilities in $\mathcal{F}_{\ell,\varphi}(\varphi)$ and $\mathcal{F}_{\ell,\varphi}(\neg\varphi)$ in his evidence space.
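The construction just described amounts to grouping the per-state measures by the truth value of the query. The following minimal sketch assumes a complete algorithm (so the only observations are “Yes” and “No”) and summarizes each state by the probability of a “Yes” answer; the `State` record and the function name `evidence_space` are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class State:
    name: str
    local_state: str       # the agent's local state at this state
    phi_holds: bool        # whether the queried formula is true here
    yes_prob: Fraction     # probability that the algorithm answers "Yes"


def evidence_space(states: list, local_state: str) -> dict:
    """Collect the measures for phi and not-phi at a given local state.

    Only states in which the agent has `local_state` contribute, since
    these are the states the agent cannot tell apart.
    """
    f = {"phi": [], "not_phi": []}
    for s in states:
        if s.local_state != local_state:
            continue
        mu = {"Yes": s.yes_prob, "No": 1 - s.yes_prob}
        f["phi" if s.phi_holds else "not_phi"].append(mu)
    return f
```

For example, two states sharing one local state, with “Yes” probabilities 1 (query true) and 1/2 (query false), yield one measure for each hypothesis, which is the shape of the double-headed coin example.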

We can now make precise the claim at which we have been hinting throughout the
paper. Under our assumptions, for all evidence spaces of the form $\mathcal{E}_{\mathtt{A}_i,\varphi,\ell}$ that arise in this
construction, and all observations $ob$ that can be made in local state $\ell$, there must be some
expression in $W_{\mathcal{E}_{\mathtt{A}_i,\varphi,\ell}}(ob, h)$ with a nonzero denominator. Intuitively, this is because if $ob$
is observed at some state $s$ such that $L_i(s) = \ell$, our assumptions ensure that $\mu_{s,\varphi}(ob) > 0$.
In other words, observing $ob$ means that the probability of observing $ob$ must be greater
than 0.

Proposition 5.1. For all probabilistic algorithmic knowledge structures $N$, agents $i$, formulas
$\varphi$, and local states $\ell$ of agent $i$ that arise in $N$, if $ob$ is a possible output of $i$'s
knowledge algorithm $\mathtt{A}_i^d$ in local state $\ell$ on input $\varphi$, then there exists a probability measure
$\mu \in \mathcal{F}_{\ell,\varphi}(\varphi) \cup \mathcal{F}_{\ell,\varphi}(\neg\varphi)$ such that $\mu(ob) > 0$.

In particular, it follows from Proposition 5.1 that, under our assumptions, the evidence
function is always defined in the special case where $\mathcal{F}_{\ell,\varphi}(h)$ is a singleton for all hypotheses
$h$.

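To be able to talk about evidence within the logic, we introduce operators to capture
the lower and upper evidence provided by the knowledge algorithm of agent $i$, $\underline{\mathrm{Ev}}_i(\varphi)$ and
$\overline{\mathrm{Ev}}_i(\varphi)$, read “$i$'s lower (resp., upper) weight of evidence for $\varphi$”, with semantics defined as
follows:

$(N, s, v) \models \underline{\mathrm{Ev}}_i(\varphi) \ge \alpha$ if $\underline{w}_{\mathcal{E}_{\mathtt{A}_i,\varphi,L_i(s)}}(\mathtt{A}_i^d(\varphi, L_i(s), s, v_i), \varphi) \ge \alpha$

$(N, s, v) \models \overline{\mathrm{Ev}}_i(\varphi) \ge \alpha$ if $\overline{w}_{\mathcal{E}_{\mathtt{A}_i,\varphi,L_i(s)}}(\mathtt{A}_i^d(\varphi, L_i(s), s, v_i), \varphi) \ge \alpha$.

We similarly define $(N, s, v) \models \underline{\mathrm{Ev}}_i(\varphi) \le \alpha$, $(N, s, v) \models \overline{\mathrm{Ev}}_i(\varphi) \le \alpha$, $(N, s, v) \models \underline{\mathrm{Ev}}_i(\varphi) = \alpha$,
and $(N, s, v) \models \overline{\mathrm{Ev}}_i(\varphi) = \alpha$. By Proposition 5.1, these formulas are all well defined.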

This definition of evidence has a number of interesting properties. For instance, obtaining
full evidence in support of a formula $\varphi$ essentially corresponds to establishing the
truth of $\varphi$.

Proposition 5.2. For all probabilistic algorithmic knowledge structures $N$, we have

$$N \models \underline{\mathrm{Ev}}_i(\varphi) = 1 \Rightarrow \varphi.$$

Suppose that we now apply the recipe above to derive the evidence spaces for the three
examples at the beginning of this section. For the double-headed coin example, consider a
structure $N$ with two states $s_1$ and $s_2$, where the coin is double-headed at state $s_1$ and fair
at state $s_2$, so that $(N,s_1,v) \models \mathit{dh}$ and $(N,s_2,v) \models \neg\mathit{dh}$. Since Bob does not know whether
the coin is fair or double-headed, it seems reasonable to assume that Bob has the same local
state $\ell_0$ at both states. Thus, $S_{\ell_0} = \{s_1,s_2\}$, $S_{\ell_0,\mathit{dh}} = \{s_1\}$, and $S_{\ell_0,\neg\mathit{dh}} = \{s_2\}$. Since we are
interested only in the query $\mathit{dh}$ and there is only one local state, we can consider the single
evidence space

$$\mathcal{E} = (\{\mathit{dh}, \neg\mathit{dh}\}, \{\text{“Yes”}, \text{“No”}\}, \mathcal{F}_{\mathit{dh}}),$$

where

$$\begin{aligned}
\mathcal{F}_{\mathit{dh}}(\mathit{dh}) &= \{\mu_{s_1}\} \\
\mathcal{F}_{\mathit{dh}}(\neg\mathit{dh}) &= \{\mu_{s_2}\} \\
\mu_{s_1}(\text{“Yes”}) &= 1 \\
\mu_{s_2}(\text{“Yes”}) &= 1/2.
\end{aligned}$$

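We can check that, for all states $(s,v)$ where $A^d_{\mathrm{Bob}}(\mathit{dh}, \ell_0, s, v_{\mathrm{Bob}}) = \text{“Yes”}$, $(N,s,v) \models
\underline{\mathrm{Ev}}(\mathit{dh}) = 2/3$ and $(N,s,v) \models \overline{\mathrm{Ev}}(\mathit{dh}) = 2/3$, while at all states $(s,v)$ where
$A^d_{\mathrm{Bob}}(\mathit{dh}, \ell_0, s, v_{\mathrm{Bob}}) = \text{“No”}$, $(N,s,v) \models \underline{\mathrm{Ev}}(\mathit{dh}) = 0$ and $(N,s,v) \models \overline{\mathrm{Ev}}(\mathit{dh}) = 0$. In other
words, the algorithm answering “Yes” provides evidence for the coin being double-headed,
while the algorithm answering “No” essentially says that the coin is fair.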
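The values 2/3 and 0 can be reproduced with a few lines of Python. This is only a sketch: it assumes that, when each hypothesis has a single probability measure, the weight of evidence is the normalized likelihood $\mu_h(ob) / \sum_{h'} \mu_{h'}(ob)$, and the names `weight_of_evidence` and `coin` are ours, introduced for illustration.

```python
from fractions import Fraction


def weight_of_evidence(likelihoods, ob, h):
    """Normalized likelihood of observation `ob` under hypothesis `h`.

    Assumes each hypothesis comes with exactly one probability measure,
    given as a dict from observations to probabilities.
    """
    total = sum(mu[ob] for mu in likelihoods.values())
    if total == 0:
        raise ValueError("observation has probability 0 under every hypothesis")
    return likelihoods[h][ob] / total


# mu_{s1}: the double-headed coin; mu_{s2}: the fair coin.
coin = {
    "dh": {"Yes": Fraction(1), "No": Fraction(0)},
    "not_dh": {"Yes": Fraction(1, 2), "No": Fraction(1, 2)},
}

yes_dh = weight_of_evidence(coin, "Yes", "dh")
no_dh = weight_of_evidence(coin, "No", "dh")
```

Because both hypotheses are singletons here, the lower and upper weights of evidence coincide, which is why a single number suffices.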

For the probabilistic sensor example, consider a structure $N$ with states $s_m$ ($m \ge 1$),
where the wall at state $s_m$ is at distance $m$ from the robot. Suppose that we are interested
in the hypotheses $\mathit{wall}(10)$ and $\neg\mathit{wall}(10)$, so that $(N,s_m,v) \models \mathit{wall}(10)$ if and only if $m \le 10$.
The local state of the robot is the same at every state, say $\ell_0$. Thus, $S_{\ell_0} = \{s_m \mid m \ge 1\}$,
$S_{\ell_0,\mathit{wall}(10)} = \{s_m \mid 1 \le m \le 10\}$, and $S_{\ell_0,\neg\mathit{wall}(10)} = \{s_m \mid m \ge 11\}$. Again, since there is
only one local state and we are interested in only one query ($\mathit{wall}(10)$), we can consider the
single evidence space

$$\mathcal{E} = (\{\mathit{wall}(10), \neg\mathit{wall}(10)\}, \{\text{“Yes”}, \text{“No”}\}, \mathcal{F}_{\mathit{wall}(10)}),$$

where

$$\begin{aligned}
\mathcal{F}_{\mathit{wall}(10)}(\mathit{wall}(10)) &= \{\mu_m \mid 1 \le m \le 10\} \\
\mathcal{F}_{\mathit{wall}(10)}(\neg\mathit{wall}(10)) &= \{\mu_m \mid m \ge 11\} \\
\mu_m(\text{“Yes”}) &= \begin{cases}
1 & \text{if } m \le 9 \\
3/4 & \text{if } m = 10 \\
1/4 & \text{if } m = 11 \\
0 & \text{if } m \ge 12.
\end{cases}
\end{aligned}$$

It is straightforward to compute that, for all states $(s,v)$ where $A^d_{\mathrm{Robot}}(\mathit{wall}(10), \ell_0, s, v_{\mathrm{Robot}})
= \text{“Yes”}$, $(N,s,v) \models \underline{\mathrm{Ev}}(\mathit{wall}(10)) \ge 3/4$ and $(N,s,v) \models \overline{\mathrm{Ev}}(\mathit{wall}(10)) \le 1$, while at all
states $(s,v)$ where $A^d_{\mathrm{Robot}}(\mathit{wall}(10), \ell_0, s, v_{\mathrm{Robot}}) = \text{“No”}$, $(N,s,v) \models \overline{\mathrm{Ev}}(\mathit{wall}(10)) \le 1/4$
and $(N,s,v) \models \underline{\mathrm{Ev}}(\mathit{wall}(10)) \ge 0$. In other words, the algorithm answering “Yes” provides
evidence for the wall being at distance at most 10, while the algorithm answering “No”
provides evidence for the wall being further away.
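The bounds 3/4, 1, 0, and 1/4 can be checked mechanically. The sketch below assumes that the lower and upper weights of evidence are the minimum and maximum of $\mu(ob)/(\mu(ob)+\mu'(ob))$ over all pairs with $\mu \in \mathcal{F}(\mathit{wall}(10))$, $\mu' \in \mathcal{F}(\neg\mathit{wall}(10))$ and a nonzero denominator. It also truncates the infinite set of states at a hypothetical horizon `max_m`, which is harmless here because $\mu_m(\text{“Yes”}) = 0$ for every $m \ge 12$.

```python
from fractions import Fraction
from itertools import product


def mu_yes(m):
    """Probability that the sensor algorithm says "Yes" at distance m."""
    if m <= 9:
        return Fraction(1)
    if m == 10:
        return Fraction(3, 4)
    if m == 11:
        return Fraction(1, 4)
    return Fraction(0)


def mu(m, ob):
    # The algorithm is complete: it answers either "Yes" or "No".
    return mu_yes(m) if ob == "Yes" else 1 - mu_yes(m)


def evidence_bounds(ob, max_m=20):
    """(lower, upper) weight of evidence that `ob` gives to wall(10)."""
    ratios = []
    for m_near, m_far in product(range(1, 11), range(11, max_m + 1)):
        p, q = mu(m_near, ob), mu(m_far, ob)
        if p + q > 0:  # pairs with a zero denominator do not contribute
            ratios.append(p / (p + q))
    return min(ratios), max(ratios)


yes_bounds = evidence_bounds("Yes")
no_bounds = evidence_bounds("No")
```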

Finally, we consider the primality example. Earlier we discussed this example under the
assumption that the number $n$ was not part of Alice’s local state. Under this assumption,
it seems reasonable to assume that there is only one local state, call it $\ell$, and that we
can identify the global state with the number $n$. Thus, $S_{\ell,\mathit{prime}} = \{n \mid n \text{ is prime}\}$ and
$S_{\ell,\neg\mathit{prime}} = \{n \mid n \text{ is not prime}\}$. Define $\mathcal{F}_{\mathit{prime}}(\mathit{prime}) = \{\mu_{\mathit{prime}}\}$, where $\mu_{\mathit{prime}}(\text{“Yes”}) = 1$,
while $\mathcal{F}_{\mathit{prime}}(\neg\mathit{prime}) = \{\mu_n \mid n \text{ is not prime}\}$, where $\mu_n(\text{“Yes”})$ is the fraction of numbers $a < n$
such that $P(n,a) = 0$.

What  should  we  do  if  Alice  knows  the  input  (so  that  n  is  part  of  the  local  state)?  In 
that  case,  it  seems  that  the  obvious  thing  to  do  is  to  again  have  one  state  denoted  n  for  every 
number  n,  but  since  n  is  now  part  of  the  local  state,  we  can  take  Sn  =  {n}.  But  modeling 
things  this  way  also  points  out  a  problem.  With  this  state  space,  since  the  agent  considers 
only one state possible in each local state, it is easy to check that $(N,s,v) \models \underline{\mathrm{Ev}}(\mathit{prime}) = 1$
if $s \in S_n$ with $n$ prime, and $(N,s,v) \models \underline{\mathrm{Ev}}(\neg\mathit{prime}) = 1$ if $s \in S_n$ with $n$ not prime. The
knowledge  algorithm  is  not  needed  here.  Since  the  basic  framework  implicitly  assumes  that 
agents  are  logically  omniscient,  Alice  knows  whether  or  not  n  is  prime. 

To deal with this, we need to model agents that are not logically omniscient. Intuitively,
we would like to model Alice’s subjective view of the number. If she does not know whether
the number $n$ is prime, she must consider possible a world where $n$ is prime and a world
where $n$ is not prime. Of course, if $n$ is in fact prime, then the world where $n$
is not prime is what Hintikka [1975] has called an impossible possible world, one where the
usual laws of arithmetic do not hold. Similarly, since Alice does not know how likely the
knowledge algorithm is to return “Yes” if $n$ is composite (i.e., how many witnesses $a$ there
are such that $P(n,a) = 0$), we should allow her to consider possible the impossible
worlds where the number of witnesses is $k$ for each $k \ge n/2$. (We restrict to $k \ge n/2$ to
model the fact that Alice does know that there are at least $n/2$ witnesses if $n$ is composite.)
Thus, consider the structure $N$ with states $s_{n,\mathit{prime}}$ and $s_{n,\neg\mathit{prime},k}$ (for $n \ge 2$, $n/2 \le k \le
n$). Intuitively, $s_{n,\neg\mathit{prime},k}$ is the state where there are $k$ witnesses. (Clearly, if there is
more information about the number of witnesses, then the set of states should be modified
appropriately.) At states $s_{n,\mathit{prime}}$ and $s_{n,\neg\mathit{prime},k}$, Alice has the same local state, which we
call $\ell_n$ (since we assume that $n$ is stored in her local state); however $(N, s_{n,\mathit{prime}}, v) \models \mathit{prime}$,
while $(N, s_{n,\neg\mathit{prime},k}, v) \models \neg\mathit{prime}$. For a local state $\ell_n$, define $S_{\ell_n,\mathit{prime}} = \{s_{n,\mathit{prime}}\}$ and
$S_{\ell_n,\neg\mathit{prime}} = \{s_{n,\neg\mathit{prime},k} \mid n/2 \le k \le n\}$, and let $S_{\ell_n} = S_{\ell_n,\mathit{prime}} \cup S_{\ell_n,\neg\mathit{prime}}$. In this model,
the evidence space at local state $\ell_n$ is therefore

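$$\mathcal{E}_n = (\{\mathit{prime}, \neg\mathit{prime}\}, \{\text{“Yes”}, \text{“No”}\}, \mathcal{F}_{\ell_n,\mathit{prime}}),$$

where

$$\begin{aligned}
\mathcal{F}_{\ell_n,\mathit{prime}}(\mathit{prime}) &= \{\mu_{n,\mathit{prime}}\} \\
\mathcal{F}_{\ell_n,\mathit{prime}}(\neg\mathit{prime}) &= \{\mu_{n,\neg\mathit{prime},k} \mid n/2 \le k \le n\} \\
\mu_{n,\mathit{prime}}(\text{“Yes”}) &= 1 \\
\mu_{n,\neg\mathit{prime},k}(\text{“Yes”}) &= 1 - k/n.
\end{aligned}$$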

Using impossible possible worlds in this way gives us just the answers we expect. We
can check that, for all states $(s,v)$ where $A^d_{\mathrm{Alice}}(\mathit{prime}, s_{\mathrm{Alice}}, s, v_{\mathrm{Alice}}) = \text{“Yes”}$, $(N,s,v) \models
\underline{\mathrm{Ev}}(\mathit{prime}) \ge 2/3$, while at all states $(s,v)$ where $A^d_{\mathrm{Alice}}(\mathit{prime}, s_{\mathrm{Alice}}, s, v_{\mathrm{Alice}}) = \text{“No”}$, $(N,s,v) \models
\overline{\mathrm{Ev}}(\mathit{prime}) = 0$. In other words, the algorithm returning “Yes” to the query whether the
number in Alice’s local state is prime provides evidence for the number being prime, while
the algorithm returning “No” essentially says that the number is composite.
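The claim that a “Yes” answer always yields lower evidence of at least 2/3 can be confirmed for small $n$ by enumerating the impossible possible worlds. This is a sketch under two assumptions of ours: $k$ ranges over the integers with $n/2 \le k \le n$, and the lower and upper weights of evidence are the minimum and maximum of the pairwise likelihood ratios. The function name `prime_evidence_bounds` is hypothetical.

```python
from fractions import Fraction


def prime_evidence_bounds(n, ob):
    """(lower, upper) weight of evidence that `ob` gives to `prime` at l_n."""
    # mu_{n,prime}: a prime is always reported as prime.
    p = Fraction(1) if ob == "Yes" else Fraction(0)
    ratios = []
    for k in range((n + 1) // 2, n + 1):  # integers k with n/2 <= k <= n
        yes_if_composite = 1 - Fraction(k, n)  # mu_{n,not prime,k}("Yes")
        q = yes_if_composite if ob == "Yes" else 1 - yes_if_composite
        if p + q > 0:
            ratios.append(p / (p + q))
    return min(ratios), max(ratios)


# Lower evidence after "Yes" and upper evidence after "No", for n = 2..199.
yes_lower = [prime_evidence_bounds(n, "Yes")[0] for n in range(2, 200)]
no_upper = [prime_evidence_bounds(n, "No")[1] for n in range(2, 200)]
```

The minimum after “Yes” is attained at the smallest admissible $k$, where the composite world answers “Yes” with probability closest to 1/2.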

5.3. Reliable Randomized Knowledge Algorithms. As we saw in the previous section,
a “Yes” answer to a query $\varphi$ given by an “almost sound” knowledge algorithm provides
evidence for $\varphi$. We now examine the extent to which we can characterize the evidence
provided by a randomized knowledge algorithm. To make this precise, we need to first
characterize how reliable the knowledge algorithm is. (In this section, for simplicity, we
assume that we are dealing with complete algorithms, which always answer either “Yes” or
“No”. Intuitively, this is because reliability, as we will soon see, talks about the probability of
a knowledge algorithm answering “Yes” or anything but “Yes”. Completeness ensures that
there is a single observation that can be interpreted as not-“Yes”; this lets us relate reliability
to our notion of evidence in Propositions 5.3 and 5.5. Allowing knowledge algorithms to
return both “No” and “?” would require us to talk about the evidence provided by the
disjunction “No”-or-“?” of the observations, a topic beyond the scope of this paper.) A
randomized knowledge algorithm $A_i$ is $(\alpha,\beta)$-reliable for $\varphi$ in $N$ (for agent $i$) if $\alpha,\beta \in [0,1]$
and for all states $s$ and derandomizers $v$,

• $(N,s,v) \models \varphi$ implies $\mu_{s,\varphi}(\text{“Yes”}) \ge \alpha$;

• $(N,s,v) \models \neg\varphi$ implies $\mu_{s,\varphi}(\text{“Yes”}) \le \beta$.

These conditions are equivalent to saying $N \models (\varphi \Rightarrow \Pr(X_i\varphi) \ge \alpha) \wedge (\neg\varphi \Rightarrow \Pr(X_i\varphi) \le \beta)$.
In other words, if $\varphi$ is true at state $s$, then an $(\alpha,\beta)$-reliable algorithm says “Yes” to $\varphi$ at
$s$ with probability at least $\alpha$ (and hence is right when it answers “Yes” to query $\varphi$ with
probability at least $\alpha$); on the other hand, if $\varphi$ is false, it says “Yes” with probability at
most $\beta$ (and hence is wrong when it answers “Yes” to query $\varphi$ with probability at most $\beta$).
The primality testing knowledge algorithm is $(1, 1/2)$-reliable for $\mathit{prime}$.

The intuition here is that $(\alpha,\beta)$-reliability is a way to bound the probability that the
knowledge algorithm is wrong. The knowledge algorithm can be wrong in two ways: it can
answer “No” or “?” to a query $\varphi$ when $\varphi$ is true, and it can answer “Yes” to a query $\varphi$
when $\varphi$ is not true. If a knowledge algorithm is $(\alpha,\beta)$-reliable, then the probability that it
answers “No” or “?” when the answer should be “Yes” is at most $1 - \alpha$; the probability
that it answers “Yes” when it should not is at most $\beta$.
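To make the definition concrete, the following sketch computes the tightest reliability pair for a finite table of states, that is, the largest $\alpha$ and smallest $\beta$ satisfying the two conditions above. The helper `tightest_reliability` is our own illustration, and the sensor table reuses the probabilities of the sensor example, truncated at an assumed horizon of 20 states.

```python
from fractions import Fraction


def tightest_reliability(states):
    """Largest alpha and smallest beta making the table (alpha, beta)-reliable.

    `states` is a list of (phi_holds, prob_yes) pairs, one per state s,
    where prob_yes is the probability of a "Yes" answer at s.
    """
    alpha = min((p for holds, p in states if holds), default=Fraction(1))
    beta = max((p for holds, p in states if not holds), default=Fraction(0))
    return alpha, beta


def sensor_prob_yes(m):
    # Probabilities of "Yes" to wall(10) from the sensor example.
    if m <= 9:
        return Fraction(1)
    if m == 10:
        return Fraction(3, 4)
    if m == 11:
        return Fraction(1, 4)
    return Fraction(0)


# wall(10) holds at distance m exactly when m <= 10.
sensor_states = [(m <= 10, sensor_prob_yes(m)) for m in range(1, 21)]
sensor_reliability = tightest_reliability(sensor_states)
```

Under these assumptions the sensor algorithm comes out as (3/4, 1/4)-reliable for $\mathit{wall}(10)$; any pair with smaller $\alpha$ or larger $\beta$ is also satisfied, since reliability only gives bounds.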

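We can now capture the relationship between reliable knowledge algorithms and evidence.
The relationship depends in part on what the agent considers possible.

Proposition 5.3. If $A_i$ is $\varphi$-complete and $(\alpha,\beta)$-reliable for $\varphi$ in $N$ then

(a) $N \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \underline{\mathrm{Ev}}_i(\varphi) \ge \frac{\alpha}{\alpha+\beta}$ if $(\alpha,\beta) \ne (0,0)$;

(b) $N \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \underline{\mathrm{Ev}}_i(\varphi) = 1$ if $(\alpha,\beta) = (0,0)$;

(c) $N \models \neg X_i\varphi \wedge \neg K_i\varphi \Rightarrow \overline{\mathrm{Ev}}_i(\varphi) \le \frac{1-\alpha}{2-(\alpha+\beta)}$ if $(\alpha,\beta) \ne (1,1)$;

(d) $N \models \neg X_i\varphi \wedge \neg K_i\varphi \Rightarrow \overline{\mathrm{Ev}}_i(\varphi) = 0$ if $(\alpha,\beta) = (1,1)$.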

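Proposition 5.3 becomes interesting in the context of well-known classes of randomized
algorithms [Motwani and Raghavan 1995]. An RP (random polynomial-time) algorithm
is a polynomial-time randomized algorithm that is $(1/2, 0)$-reliable. It thus follows from
Proposition 5.3 that if $A_i$ is an RP algorithm, then $N \models (X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \underline{\mathrm{Ev}}_i(\varphi) =
1) \wedge (\neg X_i\varphi \wedge \neg K_i\varphi \Rightarrow \overline{\mathrm{Ev}}_i(\varphi) \le 1/3)$. By Proposition 5.2, $\underline{\mathrm{Ev}}_i(\varphi) = 1 \Rightarrow \varphi$ is valid, and thus
we have $N \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \varphi$. A BPP (bounded-error probabilistic polynomial-time)
algorithm is a polynomial-time randomized algorithm that is $(3/4, 1/4)$-reliable. Thus,
by Proposition 5.3, if $A_i$ is a BPP algorithm, then $N \models (X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \underline{\mathrm{Ev}}_i(\varphi) \ge
3/4) \wedge (\neg X_i\varphi \wedge \neg K_i\varphi \Rightarrow \overline{\mathrm{Ev}}_i(\varphi) \le 1/4)$.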
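The numbers quoted for RP and BPP algorithms are direct instances of the bounds $\alpha/(\alpha+\beta)$ and $(1-\alpha)/(2-(\alpha+\beta))$ from Proposition 5.3. A small sketch, with function names of our own choosing, evaluates both bounds exactly:

```python
from fractions import Fraction


def yes_lower_bound(alpha, beta):
    """Bound on the lower evidence for phi after a "Yes" (parts (a), (b))."""
    if alpha == 0 and beta == 0:
        return Fraction(1)
    return Fraction(alpha) / (Fraction(alpha) + Fraction(beta))


def no_upper_bound(alpha, beta):
    """Bound on the upper evidence for phi after a "No" (parts (c), (d))."""
    if alpha == 1 and beta == 1:
        return Fraction(0)
    return (1 - Fraction(alpha)) / (2 - (Fraction(alpha) + Fraction(beta)))


rp = (Fraction(1, 2), Fraction(0))
bpp = (Fraction(3, 4), Fraction(1, 4))
primality = (Fraction(1), Fraction(1, 2))

rp_bounds = (yes_lower_bound(*rp), no_upper_bound(*rp))
bpp_bounds = (yes_lower_bound(*bpp), no_upper_bound(*bpp))
primality_bounds = (yes_lower_bound(*primality), no_upper_bound(*primality))
```

The last line recovers the values 2/3 and 0 obtained for the primality example from its $(1, 1/2)$-reliability alone.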

Notice that Proposition 5.3 talks about the evidence that the knowledge algorithm provides
for $\varphi$. Intuitively, we might expect some kind of relationship between the evidence
for $\varphi$ and the evidence for $\neg\varphi$. A plausible relationship would be that high evidence for $\varphi$
implies low evidence for $\neg\varphi$, and low evidence for $\varphi$ implies high evidence for $\neg\varphi$. Unfortunately,
given the definitions in this section, this is not the case. Evidence for $\varphi$ is completely
unrelated to evidence for $\neg\varphi$. Roughly speaking, this is because evidence for $\varphi$ is measured
by looking at the results of the knowledge algorithm when queried for $\varphi$, and evidence for
$\neg\varphi$ is measured by looking at the results of the knowledge algorithm when queried for $\neg\varphi$.
There is nothing in the definition of a knowledge algorithm that says that the answers of
the knowledge algorithm to queries $\varphi$ and $\neg\varphi$ need to be related in any way.

A relationship between evidence for $\varphi$ and evidence for $\neg\varphi$ can be established by
considering knowledge algorithms that are “well-behaved” with respect to negation. There is
a natural way to define the behavior of a knowledge algorithm on negated formulas.
Intuitively, a strategy to evaluate $A_i^d(\neg\varphi,\ell,s,v_i)$ is to evaluate $A_i^d(\varphi,\ell,s,v_i)$ and return the
negation of the result. There is a choice to be made in the case when $A_i$ returns “?”
to the query for $\varphi$. One possibility is to return “?” to the query for $\neg\varphi$ when the query
for $\varphi$ returns “?”; another possibility is to return “Yes” if the query for $\varphi$ returns “?”.
A randomized knowledge algorithm $A$ weakly respects negation if, for all local states $\ell$ and
derandomizers $v$,


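$$A^d(\neg\varphi,\ell,s,v_i) = \begin{cases}
\text{“Yes”} & \text{if } A^d(\varphi,\ell,s,v_i) = \text{“No”} \\
\text{“No”} & \text{if } A^d(\varphi,\ell,s,v_i) = \text{“Yes”} \\
\text{“?”} & \text{if } A^d(\varphi,\ell,s,v_i) = \text{“?”}.
\end{cases}$$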


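Similarly, a randomized knowledge algorithm $A$ strongly respects negation if, for all local
states $\ell$ and derandomizers $v$,

$$A^d(\neg\varphi,\ell,s,v_i) = \begin{cases}
\text{“Yes”} & \text{if } A^d(\varphi,\ell,s,v_i) \ne \text{“Yes”} \\
\text{“No”} & \text{if } A^d(\varphi,\ell,s,v_i) = \text{“Yes”}.
\end{cases}$$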
Note that if $A_i$ is $\varphi$-complete, then the output of $A_i$ on input $\neg\varphi$ is the same whether
$A_i$ weakly or strongly respects negation. Say $A_i$ respects negation if it weakly or strongly
respects negation. Note that if $A_i$ is $\varphi$-complete and respects negation, then $A_i$ is
$\neg\varphi$-complete.
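The two conventions differ only in how they treat “?”. As an illustration, the following sketch represents each convention as a map from the answer on $\varphi$ to the answer on $\neg\varphi$; the function names are hypothetical and the derandomized algorithm itself is abstracted away.

```python
def weak_negation(answer_for_phi):
    """Answer to the query for not-phi under weak respect for negation."""
    return {"Yes": "No", "No": "Yes", "?": "?"}[answer_for_phi]


def strong_negation(answer_for_phi):
    """Answer to the query for not-phi under strong respect for negation."""
    return "No" if answer_for_phi == "Yes" else "Yes"


# A phi-complete algorithm never answers "?", so the two conventions agree.
complete_answers = ["Yes", "No"]
agree_when_complete = all(
    weak_negation(a) == strong_negation(a) for a in complete_answers
)
```

On the answers “Yes” and “No” both maps also never produce “?”, which matches the observation that a $\varphi$-complete algorithm respecting negation is $\neg\varphi$-complete.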

Our first result shows that for knowledge algorithms that respect negation, reliability
for $\varphi$ is related to reliability for $\neg\varphi$:

Proposition 5.4. If $A_i$ respects negation, is $\varphi$-complete, and is $(\alpha,\beta)$-reliable for $\varphi$ in $N$,
then $A_i$ is $(\alpha,\beta)$-reliable for $\varphi$ in $N$ if and only if $A_i$ is $(1-\beta, 1-\alpha)$-reliable for $\neg\varphi$ in $N$.

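It is easy to check that if $A_i$ is $\varphi$-complete and respects negation, then $X_i\varphi \Leftrightarrow \neg X_i\neg\varphi$
is a valid formula. Combined with Proposition 5.5, this yields the following results.

Proposition 5.5. If $A_i$ respects negation, is $\varphi$-complete, and is $(\alpha,\beta)$-reliable for $\varphi$ in $N$,
then

(a) $N \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \left(\underline{\mathrm{Ev}}_i(\varphi) \ge \frac{\alpha}{\alpha+\beta} \wedge \overline{\mathrm{Ev}}_i(\neg\varphi) \le \frac{\beta}{\alpha+\beta}\right)$ if $(\alpha,\beta) \ne (0,0)$;

(b) $N \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \left(\underline{\mathrm{Ev}}_i(\varphi) = 1 \wedge \overline{\mathrm{Ev}}_i(\neg\varphi) = 0\right)$ if $(\alpha,\beta) = (0,0)$;

(c) $N \models X_i\neg\varphi \wedge \neg K_i\varphi \Rightarrow \left(\underline{\mathrm{Ev}}_i(\neg\varphi) \ge \frac{1-\beta}{2-(\alpha+\beta)} \wedge \overline{\mathrm{Ev}}_i(\varphi) \le \frac{1-\alpha}{2-(\alpha+\beta)}\right)$ if $(\alpha,\beta) \ne (1,1)$;

(d) $N \models X_i\neg\varphi \wedge \neg K_i\varphi \Rightarrow \left(\underline{\mathrm{Ev}}_i(\neg\varphi) \ge \frac{1}{2} \wedge \overline{\mathrm{Ev}}_i(\varphi) \le \frac{1}{2}\right)$ if $(\alpha,\beta) = (1,1)$.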

6.  Conclusion 


The  goal  of  this  paper  is  to  understand  what  the  evidence  provided  by  a  knowledge 
algorithm  tells  us.  To  take  an  example  from  security,  consider  an  enforcement  mechanism 
used  to  detect  and  react  to  intrusions  in  a  system.  Such  an  enforcement  mechanism  uses 
algorithms  that  analyze  the  behavior  of  users  and  attempt  to  recognize  intruders.  While 
the  algorithms  may  sometimes  be  wrong,  they  are  typically  reliable,  in  our  sense,  with  some 
associated  probabilities.  Clearly  the  mechanism  wants  to  make  sensible  decisions  based  on 
this  information.  How  should  it  do  this?  What  actions  should  the  system  take  based  on  a 
report  that  a  user  is  an  intruder? 

If we have a probability on the hypotheses, evidence can be used to update this
probability. More precisely, as shown in [Halpern and Fagin 1992], evidence can be viewed as a
function from priors to posteriors. For example, if the (cumulative) evidence for $n$ being a
prime is $\alpha$ and the prior probability that $n$ is prime is $\beta$, then a straightforward application
of Bayes’ rule tells us that the posterior probability of $n$ being prime (that is, the probability
of $n$ being prime in light of the evidence) is $(\alpha\beta)/(\alpha\beta + (1-\alpha)(1-\beta))$. Therefore, if we
have a prior probability on the hypotheses, including the formula $\varphi$, then we can decide to
perform an action when the posterior probability of $\varphi$ is high enough. (A similar
interpretation holds for the evidence expressed by $\underline{w}_{\mathcal{E}}$ and $\overline{w}_{\mathcal{E}}$; we hope to report on this topic in
future work.) However, what can we do when there is no probability distribution on the
hypotheses, as in the primality example at the beginning of this section? The probabilistic
interpretation of evidence still gives us a guide for decisions. As before, we assume that if
the posterior probability of $\varphi$ is high enough, we will act as if $\varphi$ holds. The problem, of
course, is that we do not have a prior probability. However, the evidence tells us what prior
probabilities we must be willing to assume for the posterior probability to be high enough.


Footnote. The logic in this paper only considers the evidence provided by a single knowledge algorithm at a single
point in time. In general, evidence from multiple sources can be accumulated over time and combined. Our
companion paper [Halpern and Pucella 2003] discusses a more general logic in which the combination of
evidence can be expressed.
For example, a “Yes” from a $(.999, .001)$-reliable algorithm for $\varphi$ says that as long as the
prior probability of $\varphi$ is at least .01, the posterior is at least .9. This may be sufficient
assurance for an agent to act.
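The arithmetic behind this example can be checked directly. The sketch below takes the evidence after a “Yes” to be $\alpha/(\alpha+\beta) = .999$, the bound given by Proposition 5.3, and applies the Bayes' rule update stated above; the function name `posterior` is ours.

```python
from fractions import Fraction


def posterior(evidence, prior):
    """Posterior probability of a hypothesis given its weight of evidence."""
    return (evidence * prior) / (
        evidence * prior + (1 - evidence) * (1 - prior)
    )


alpha, beta = Fraction(999, 1000), Fraction(1, 1000)
evidence_after_yes = alpha / (alpha + beta)

# Posterior at the smallest prior mentioned in the text.
posterior_at_threshold = posterior(evidence_after_yes, Fraction(1, 100))
```

Since the update is increasing in both the evidence and the prior, any prior above .01 or any evidence above .999 can only raise the posterior further.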

Of course, it is also possible to treat evidence as primitive, and simply decide to act as
if the hypothesis for which there is more evidence, or the hypothesis for which evidence
is above a certain threshold, is true. It would in fact be of independent interest to study the
properties of a theory of decisions based on a primitive notion of evidence. We leave this
to future work.
Appendix A. Proofs

Proposition 3.1. Let $N = (S, \pi, L_1, \ldots, L_n, A_1^d, \ldots, A_n^d, \nu)$ be a probabilistic algorithmic
knowledge structure, with $A_1, \ldots, A_n$ deterministic. Let $M = (S, \pi, L_1, \ldots, L_n, A_1, \ldots, A_n)$. If there
are no occurrences of $\Pr$ in $\varphi$, then for all $s \in S$ and all $v \in V$, $(N,s,v) \models \varphi$ if and only if
$(M,s) \models \varphi$.

Proof. The key observation here is that if a knowledge algorithm $A$ is deterministic, then
for all $v \in V$, $A^d(\varphi, \ell, s, v_i) = A(\varphi, \ell, s)$. The result then follows easily by induction on the
structure of $\varphi$. If $\varphi$ is $p$, then $(N,s,v) \models p$ if and only if $\pi(s)(p) = \mathbf{true}$ if and only if
$(M,s) \models p$. If $\varphi$ is $\psi_1 \wedge \psi_2$, then $(N,s,v) \models \psi_1 \wedge \psi_2$ if and only if $(N,s,v) \models \psi_1$ and
$(N,s,v) \models \psi_2$ if and only if $(M,s) \models \psi_1$ and $(M,s) \models \psi_2$ (by the induction hypothesis) if
and only if $(M,s) \models \psi_1 \wedge \psi_2$. If $\varphi$ is $\neg\psi$, then $(N,s,v) \models \neg\psi$ if and only if $(N,s,v) \not\models \psi$
if and only if $(M,s) \not\models \psi$ (by the induction hypothesis) if and only if $(M,s) \models \neg\psi$. If $\varphi$ is
$K_i\psi$, suppose that $(N,s,v) \models K_i\psi$, that is, for all $v' \in V$ and all $t \sim_i s$, $(N,t,v') \models \psi$; by
the induction hypothesis, this means that for all $t \sim_i s$, $(M,t) \models \psi$, that is, $(M,s) \models K_i\psi$.
Conversely, suppose that $(M,s) \models K_i\psi$, so that for all $t \sim_i s$, $(M,t) \models \psi$; by the induction
hypothesis, for every $t \sim_i s$, we have $(N,t,v') \models \psi$ for all $v' \in V$, and thus $(N,s,v) \models K_i\psi$.
If $\varphi$ is $X_i\psi$, then $(N,s,v) \models X_i\psi$ if and only if $A_i^d(\psi, L_i(s), s, v_i) = \text{“Yes”}$ if and only if
$A_i(\psi, L_i(s), s) = \text{“Yes”}$ (since $A_i$ is deterministic) if and only if $(M,s) \models X_i\psi$. □

Proposition 3.2. Let $N = (S, \pi, L_1, \ldots, L_n, A_1^d, \ldots, A_n^d, \nu)$ be a probabilistic algorithmic
knowledge structure, and let $M = (S, \pi, L_1, \ldots, L_n, A'_1, \ldots, A'_n)$ be an algorithmic knowledge
structure where $A'_1, \ldots, A'_n$ are arbitrary deterministic knowledge algorithms. If there are no
occurrences of $X_i$ and $\Pr$ in $\varphi$, then for all $s \in S$ and all $v \in V$, $(N,s,v) \models \varphi$ if and only
if $(M,s) \models \varphi$.

Proof. This result in fact follows from the proof of Proposition 3.1, since the only use of
the assumption that knowledge algorithms are deterministic is in the inductive step for
subformulas of the form $X_i\psi$. □
Proposition 4.2. Suppose that $N = (S, \pi, L_1, \ldots, L_n, A_1^d, \ldots, A_n^d, \nu)$ is a probabilistic
algorithmic knowledge security structure with an adversary as agent $i$ and that $A_i = A_i^{\mathrm{DY+rg}(r)}$.
Let $K$ be the number of distinct keys used in the messages in the adversary’s local state $\ell$
(i.e., the number of keys used in the messages that the adversary has intercepted at a state
$s$ with $L_i(s) = \ell$). Suppose that $K/|\mathcal{K}| < 1/2$ and that $\nu$ is the uniform distribution on
sequences of coin tosses. If $(N,s,v) \models \neg K_i X_i(\mathit{has}_i(m))$, then $(N,s,v) \models \Pr(X_i(\mathit{has}_i(m))) <
1 - e^{-2rK/|\mathcal{K}|}$. Moreover, if $(N,s,v) \models X_i(\mathit{has}_i(m))$ then $(N,s,v) \models \mathit{has}_i(m)$.


Proof. It is not hard to show that the $r$ keys that the adversary guesses do no good at all if
none of them match a key used in a message intercepted by the adversary. By assumption,
$K$ keys are used in messages intercepted by the adversary. The probability that a key
chosen at random is one of these $K$ is $K/|\mathcal{K}|$, since there are $|\mathcal{K}|$ keys altogether. Thus, the
probability that a key chosen at random is not one of these $K$ is $1 - (K/|\mathcal{K}|)$. The probability
that none of the $r$ keys chosen at random is one of these $K$ is therefore $(1 - (K/|\mathcal{K}|))^r$. We
now use some standard approximations. Note that $(1 - (K/|\mathcal{K}|))^r = e^{r\ln(1-(K/|\mathcal{K}|))}$ and

$$\ln(1-x) = -x - x^2/2 - x^3/3 - \cdots > -x - x^2 - x^3 - \cdots = -x/(1-x).$$

Thus, if $0 < x < 1/2$, then $\ln(1-x) > -2x$. It follows that if $K/|\mathcal{K}| < 1/2$, then
$e^{r\ln(1-(K/|\mathcal{K}|))} > e^{-2rK/|\mathcal{K}|}$. Since the probability that a key chosen at random does not help
to compute algorithmic knowledge is greater than $e^{-2rK/|\mathcal{K}|}$, the probability that it helps is
less than $1 - e^{-2rK/|\mathcal{K}|}$.

Soundness of $A_i$ with respect to $\mathit{has}_i(m)$ follows from Proposition 4.1 (since soundness
follows for arbitrary $\mathit{initkeys}(\ell) \subseteq \mathcal{K}$). □
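As a numerical sanity check on the approximation used in this proof, the following sketch compares the exact probability $(1 - K/|\mathcal{K}|)^r$ that all $r$ guesses miss with the bound $e^{-2rK/|\mathcal{K}|}$. The key-space size of 100 and the ranges of $r$ and $K$ are arbitrary values chosen for illustration.

```python
import math


def miss_probability(r, used_keys, total_keys):
    """Exact probability that none of r uniform guesses hits a used key."""
    return (1 - used_keys / total_keys) ** r


def miss_lower_bound(r, used_keys, total_keys):
    """The bound e^{-2 r K / |K|}, valid when 0 < K/|K| < 1/2."""
    return math.exp(-2 * r * used_keys / total_keys)


TOTAL_KEYS = 100  # hypothetical size of the key space
bound_holds = all(
    miss_probability(r, k, TOTAL_KEYS) > miss_lower_bound(r, k, TOTAL_KEYS)
    for r in range(1, 51)
    for k in range(1, 50)  # keeps K/|K| strictly between 0 and 1/2
)
```

Equivalently, the probability that guessing helps, $1 - (1 - K/|\mathcal{K}|)^r$, stays below $1 - e^{-2rK/|\mathcal{K}|}$ on this grid.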

Proposition 5.1. For all probabilistic algorithmic knowledge structures $N$, agents $i$,
formulas $\varphi$, and local states $\ell$ of agent $i$ that arise in $N$, if $ob$ is a possible output of $i$’s
knowledge algorithm $A_i^d$ in local state $\ell$ on input $\varphi$, then there exists a probability measure
$\mu \in \mathcal{F}_{\ell,\varphi}(\varphi) \cup \mathcal{F}_{\ell,\varphi}(\neg\varphi)$ such that $\mu(ob) > 0$.

Proof. Suppose that $\ell$ is a local state of agent $i$ that arises in $N$ and $ob$ is a possible output
of $A_i^d$ in local state $\ell$ on input $\varphi$. Thus, there exists a state $s$ and derandomizer $v$ such that
$A_i^d(\varphi, L_i(s), s, v_i) = ob$. By assumption, $\mu_{s,\varphi}(ob) > 0$. □

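Proposition 5.2. For all probabilistic algorithmic knowledge structures $N$, we have

$$N \models \underline{\mathrm{Ev}}_i(\varphi) = 1 \Rightarrow \varphi.$$

Proof. If $(N,s,v) \models \underline{\mathrm{Ev}}_i(\varphi) = 1$, then $\underline{w}_{\mathcal{E}_{A_i,\varphi,L_i(s)}}(A_i^d(\varphi, L_i(s), s, v_i), \varphi) = 1$. By definition
of $\underline{w}_{\mathcal{E}_{A_i,\varphi,L_i(s)}}$, this implies that for all $s' \in S_{L_i(s),\neg\varphi}$, $\mu_{s',\varphi}(A_i^d(\varphi, L_i(s), s, v_i)) = 0$. By our
assumption about derandomizers, this means that there is no state $s'$ and derandomizer $v'$
such that $L_i(s') = L_i(s)$ and $(N,s',v') \models \neg\varphi$ where $A_i^d(\varphi, L_i(s), s', v'_i) = A_i^d(\varphi, L_i(s), s, v_i)$.
Thus, we cannot have $(N,s,v) \models \neg\varphi$. Hence, $(N,s,v) \models \varphi$, as required. □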


The following lemma gives an algebraic relationship that is useful in the proofs of
Propositions 5.3 and 5.5.

Lemma A.1. Suppose that $x, y, a, b$ are real numbers in $[0,1]$ such that $x$ and $y$ are not
both 0, and $a$ and $b$ are not both 0. If $x \ge a$ and $y \le b$, then $x/(x+y) \ge a/(a+b)$ and
$y/(x+y) \le b/(a+b)$.

Proof. Note that $x(a+b) = xa + xb \ge xa + ay = a(x+y)$, so that $x/(x+y) \ge a/(a+b)$.
Similarly, $y(a+b) = ya + yb \le xb + yb = b(x+y)$, so that $y/(x+y) \le b/(a+b)$. □
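Lemma A.1 is easy to spot-check by brute force. The sketch below tests both inequalities on every combination of values from a small grid of rationals in $[0,1]$; the grid and the helper name are our own choices, and passing the check is of course no substitute for the proof.

```python
from fractions import Fraction
from itertools import product

# Grid of rationals 0, 1/4, 1/2, 3/4, 1.
GRID = [Fraction(k, 4) for k in range(5)]


def lemma_a1_holds(x, y, a, b):
    """Both conclusions of Lemma A.1 for one choice of x, y, a, b."""
    return x / (x + y) >= a / (a + b) and y / (x + y) <= b / (a + b)


violations = [
    (x, y, a, b)
    for x, y, a, b in product(GRID, repeat=4)
    if x + y > 0 and a + b > 0  # "not both 0" hypotheses
    and x >= a and y <= b       # hypotheses of the lemma
    and not lemma_a1_holds(x, y, a, b)
]
```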

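Proposition 5.3. If $A_i$ is $\varphi$-complete and $(\alpha,\beta)$-reliable for $\varphi$ in $N$ then

(a) $N \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \underline{\mathrm{Ev}}_i(\varphi) \ge \frac{\alpha}{\alpha+\beta}$ if $(\alpha,\beta) \ne (0,0)$;

(b) $N \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \underline{\mathrm{Ev}}_i(\varphi) = 1$ if $(\alpha,\beta) = (0,0)$;

(c) $N \models \neg X_i\varphi \wedge \neg K_i\varphi \Rightarrow \overline{\mathrm{Ev}}_i(\varphi) \le \frac{1-\alpha}{2-(\alpha+\beta)}$ if $(\alpha,\beta) \ne (1,1)$;

(d) $N \models \neg X_i\varphi \wedge \neg K_i\varphi \Rightarrow \overline{\mathrm{Ev}}_i(\varphi) = 0$ if $(\alpha,\beta) = (1,1)$.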

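Proof. For part (a), suppose that $(\alpha,\beta) \ne (0,0)$ and that $(N,s,v) \models X_i\varphi \wedge \neg K_i\neg\varphi$. Then
$A_i^d(\varphi, L_i(s), s, v_i) = \text{“Yes”}$ and there exists some $s' \sim_i s$ such that $(N,s',v) \models \varphi$. By the
latter fact, $S_{L_i(s),\varphi} \ne \emptyset$. If $S_{L_i(s),\neg\varphi} = \emptyset$, then it is easy to see that $(N,s,v) \models \underline{\mathrm{Ev}}_i(\varphi) = 1$.
If $S_{L_i(s),\neg\varphi} \ne \emptyset$, let $s', s''$ be two arbitrary states in $S_{L_i(s),\varphi}$ and $S_{L_i(s),\neg\varphi}$, respectively. Since
$A_i$ is $(\alpha,\beta)$-reliable, $\mu_{s',\varphi}(\text{“Yes”}) \ge \alpha$ and $\mu_{s'',\varphi}(\text{“Yes”}) \le \beta$. Therefore, by Lemma A.1,
we have

$$\frac{\mu_{s',\varphi}(\text{“Yes”})}{\mu_{s',\varphi}(\text{“Yes”}) + \mu_{s'',\varphi}(\text{“Yes”})} \ge \frac{\alpha}{\alpha+\beta}.$$

Since $s'$ and $s''$ were arbitrary, it follows that $\underline{w}_{\mathcal{E}_{A_i,\varphi,L_i(s)}}(\text{“Yes”}, \varphi) \ge \alpha/(\alpha+\beta)$. Thus, we
have $(N,s,v) \models \underline{\mathrm{Ev}}_i(\varphi) \ge \alpha/(\alpha+\beta)$. Since $s$ and $v$ were arbitrary, we have

$$N \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \underline{\mathrm{Ev}}_i(\varphi) \ge \frac{\alpha}{\alpha+\beta}.$$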

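For part (b), suppose that $(\alpha,\beta) = (0,0)$, and that $(N,s,v) \models X_i\varphi \wedge \neg K_i\neg\varphi$. Then
$A_i^d(\varphi, L_i(s), s, v_i) = \text{“Yes”}$ and there exists some $s' \sim_i s$ such that $(N,s',v) \models \varphi$. By the
latter fact, $S_{L_i(s),\varphi} \ne \emptyset$. If $S_{L_i(s),\neg\varphi} = \emptyset$, then it is easy to see that $(N,s,v) \models \underline{\mathrm{Ev}}_i(\varphi) = 1$.
If $S_{L_i(s),\neg\varphi} \ne \emptyset$, consider all pairs of states $s', s''$ with $s' \in S_{L_i(s),\varphi}$ and $s'' \in S_{L_i(s),\neg\varphi}$. Since
$A_i$ is $(0,0)$-reliable, $\mu_{s',\varphi}(\text{“Yes”}) \ge 0$ and $\mu_{s'',\varphi}(\text{“Yes”}) = 0$. For such a pair $s', s''$, either
$\mu_{s',\varphi}(\text{“Yes”}) = 0$, in which case $\mu_{s',\varphi}(\text{“Yes”}) + \mu_{s'',\varphi}(\text{“Yes”}) = 0$, and the pair does
not contribute to $W_{\mathcal{E}_{A_i,\varphi,L_i(s)}}(\text{“Yes”}, \varphi)$; or $\mu_{s',\varphi}(\text{“Yes”}) = a > 0$ (and by Proposition 5.1, at
least one such state $s'$ always exists), in which case an argument similar to part (a) applies and,
by Lemma A.1, we have

$$\frac{\mu_{s',\varphi}(\text{“Yes”})}{\mu_{s',\varphi}(\text{“Yes”}) + \mu_{s'',\varphi}(\text{“Yes”})} \ge \frac{a}{a} = 1.$$

Thus, $W_{\mathcal{E}_{A_i,\varphi,L_i(s)}}(\text{“Yes”}, \varphi) = \{1\}$, $\underline{w}_{\mathcal{E}_{A_i,\varphi,L_i(s)}}(\text{“Yes”}, \varphi) = 1$, and $(N,s,v) \models \underline{\mathrm{Ev}}_i(\varphi) = 1$.
Since $s$ and $v$ were arbitrary, we have $N \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \underline{\mathrm{Ev}}_i(\varphi) = 1$.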

For part (c), suppose that $(\alpha,\beta) \ne (1,1)$ and that $(N,s,v) \models \neg X_i\varphi \wedge \neg K_i\varphi$. Thus,
$A_i^d(\varphi, L_i(s), s, v_i) = \text{“No”}$ (since $A_i$ is $\varphi$-complete) and there exists some $s' \sim_i s$ such that
$(N,s',v) \models \neg\varphi$. By the latter fact, $S_{L_i(s),\neg\varphi} \ne \emptyset$. If $S_{L_i(s),\varphi} = \emptyset$, then it is easy to
see that $(N,s,v) \models \overline{\mathrm{Ev}}_i(\varphi) = 0$. If $S_{L_i(s),\varphi} \ne \emptyset$, let $s', s''$ be two arbitrary states in
$S_{L_i(s),\varphi}$ and $S_{L_i(s),\neg\varphi}$, respectively. Since $A_i$ is $\varphi$-complete and $(\alpha,\beta)$-reliable, we have
$\mu_{s',\varphi}(\text{“No”}) \le 1 - \alpha$ and $\mu_{s'',\varphi}(\text{“No”}) \ge 1 - \beta$. (This is where we use $\varphi$-completeness;
otherwise, the best we can say is that $\mu_{s'',\varphi}(\text{“No”}) + \mu_{s'',\varphi}(\text{“?”}) \ge 1 - \beta$.) Therefore, by
Lemma A.1, we have

$$\frac{\mu_{s',\varphi}(\text{“No”})}{\mu_{s',\varphi}(\text{“No”}) + \mu_{s'',\varphi}(\text{“No”})} \le \frac{1-\alpha}{2-(\alpha+\beta)}.$$

Since $s'$ and $s''$ were arbitrary, we have $(N,s,v) \models \overline{\mathrm{Ev}}_i(\varphi) \le (1-\alpha)/(2-(\alpha+\beta))$. Since $s$
and $v$ were arbitrary, we have

$$N \models \neg X_i\varphi \wedge \neg K_i\varphi \Rightarrow \overline{\mathrm{Ev}}_i(\varphi) \le \frac{1-\alpha}{2-(\alpha+\beta)}.$$

The proof of part (d) is similar to that of (b), and is left to the reader. □

Proposition 5.4. If $A_i$ respects negation, is $\varphi$-complete, and is $(\alpha,\beta)$-reliable for $\varphi$ in $N$,
then $A_i$ is $(\alpha,\beta)$-reliable for $\varphi$ in $N$ if and only if $A_i$ is $(1-\beta, 1-\alpha)$-reliable for $\neg\varphi$ in $N$.

Proof. This is almost immediate from the definitions. Suppose that $A_i$ respects negation, is
$\varphi$-complete, and is $(\alpha,\beta)$-reliable for $\varphi$ in $N$. Consider the reliability of $A_i$ with respect to
$\neg\varphi$. If $(N,s,v) \models \neg\varphi$, then

$$\mu_{s,\neg\varphi}(\text{“Yes”}) = 1 - \mu_{s,\neg\varphi}(\text{“No”}) = 1 - \mu_{s,\varphi}(\text{“Yes”}) \ge 1 - \beta.$$

Similarly, if $(N,s,v) \models \varphi$, then $\mu_{s,\neg\varphi}(\text{“Yes”}) \le 1 - \alpha$. Thus, $A_i$ is $(1-\beta, 1-\alpha)$-reliable for
$\neg\varphi$ in $N$.

An identical argument (replacing $\varphi$ by $\neg\varphi$, and $(\alpha,\beta)$ by $(1-\beta, 1-\alpha)$) shows that if
$A_i$ is $(1-\beta, 1-\alpha)$-reliable for $\neg\varphi$ in $N$ then $A_i$ is $(\alpha,\beta)$-reliable for $\varphi$ in $N$. We leave details
to the reader. □

To prove Proposition 5.5, we need a preliminary lemma, alluded to in the text.

Lemma A.2. If $N$ is a probabilistic algorithmic knowledge structure where agent $i$ uses
a knowledge algorithm $A_i$ that is $\varphi$-complete and that respects negation, then $N \models X_i\varphi \Leftrightarrow
\neg X_i\neg\varphi$.

Proof. Let $s$ be a state of $N$ and let $v$ be a derandomizer. If $(N,s,v) \models X_i\varphi$, then
$A_i^d(\varphi, L_i(s), s, v_i) = \text{“Yes”}$. Since $A_i$ respects negation and is $\varphi$-complete, this implies that
$A_i^d(\neg\varphi, L_i(s), s, v_i) = \text{“No”}$ ($A_i^d$ cannot return “?” since it is $\varphi$-complete) and hence that
$(N,s,v) \not\models X_i\neg\varphi$, so $(N,s,v) \models \neg X_i\neg\varphi$. Thus, $(N,s,v) \models X_i\varphi \Rightarrow \neg X_i\neg\varphi$. Since $s$
and $v$ were arbitrary, we have that $N \models X_i\varphi \Rightarrow \neg X_i\neg\varphi$. Conversely, let $s$ be a state
of $N$ and let $v$ be a derandomizer. If $(N,s,v) \models \neg X_i\neg\varphi$, then $(N,s,v) \not\models X_i\neg\varphi$, that
is, $A_i^d(\neg\varphi, L_i(s), s, v_i) \ne \text{“Yes”}$. Since $A_i$ is $\varphi$-complete and respects negation, $A_i$ is
$\neg\varphi$-complete, so it must be the case that $A_i^d(\neg\varphi, L_i(s), s, v_i) = \text{“No”}$. Therefore,
$A_i^d(\varphi, L_i(s), s, v_i) = \text{“Yes”}$, and $(N,s,v) \models X_i\varphi$. Since $s$ and $v$ were arbitrary, we have
that $N \models \neg X_i\neg\varphi \Rightarrow X_i\varphi$. □

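Proposition 5.5. If $A_i$ respects negation, is $\varphi$-complete, and is $(\alpha,\beta)$-reliable for $\varphi$ in $N$,
then

(a) $N \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \left(\underline{\mathrm{Ev}}_i(\varphi) \ge \frac{\alpha}{\alpha+\beta} \wedge \overline{\mathrm{Ev}}_i(\neg\varphi) \le \frac{\beta}{\alpha+\beta}\right)$ if $(\alpha,\beta) \ne (0,0)$;

(b) $N \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \left(\underline{\mathrm{Ev}}_i(\varphi) = 1 \wedge \overline{\mathrm{Ev}}_i(\neg\varphi) = 0\right)$ if $(\alpha,\beta) = (0,0)$;

(c) $N \models X_i\neg\varphi \wedge \neg K_i\varphi \Rightarrow \left(\underline{\mathrm{Ev}}_i(\neg\varphi) \ge \frac{1-\beta}{2-(\alpha+\beta)} \wedge \overline{\mathrm{Ev}}_i(\varphi) \le \frac{1-\alpha}{2-(\alpha+\beta)}\right)$ if $(\alpha,\beta) \ne (1,1)$;

(d) $N \models X_i\neg\varphi \wedge \neg K_i\varphi \Rightarrow \left(\underline{\mathrm{Ev}}_i(\neg\varphi) \ge \frac{1}{2} \wedge \overline{\mathrm{Ev}}_i(\varphi) \le \frac{1}{2}\right)$ if $(\alpha,\beta) = (1,1)$.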

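Proof. Suppose that $A_i$ is $(\alpha,\beta)$-reliable for $\varphi$ in $N$. Since $A_i$ is $\varphi$-complete and respects
negation, by Proposition 5.4, $A_i$ is $(1-\beta, 1-\alpha)$-reliable for $\neg\varphi$ in $N$. For part (a), suppose
that $(\alpha,\beta) \ne (0,0)$. Let $s$ be a state of $N$ and let $v$ be a derandomizer. By Proposition 5.3
applied to $\varphi$,

$$(N,s,v) \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \underline{\mathrm{Ev}}_i(\varphi) \ge \frac{\alpha}{\alpha+\beta}.$$

By Lemma A.2, $(N,s,v) \models X_i\varphi \Rightarrow \neg X_i\neg\varphi$. By Proposition 5.3 applied to $\neg\varphi$, $(N,s,v) \models
\neg X_i\neg\varphi \wedge \neg K_i\neg\varphi \Rightarrow \overline{\mathrm{Ev}}_i(\neg\varphi) \le (1-(1-\beta))/(1-(1-\alpha)+1-(1-\beta))$, that is, $(N,s,v) \models
\neg X_i\neg\varphi \wedge \neg K_i\neg\varphi \Rightarrow \overline{\mathrm{Ev}}_i(\neg\varphi) \le \beta/(\alpha+\beta)$. Putting this together, we get

$$(N,s,v) \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \left(\underline{\mathrm{Ev}}_i(\varphi) \ge \frac{\alpha}{\alpha+\beta} \wedge \overline{\mathrm{Ev}}_i(\neg\varphi) \le \frac{\beta}{\alpha+\beta}\right).$$

Since $s$ and $v$ are arbitrary,

$$N \models X_i\varphi \wedge \neg K_i\neg\varphi \Rightarrow \left(\underline{\mathrm{Ev}}_i(\varphi) \ge \frac{\alpha}{\alpha+\beta} \wedge \overline{\mathrm{Ev}}_i(\neg\varphi) \le \frac{\beta}{\alpha+\beta}\right).$$

For part (c), suppose that $(\alpha,\beta) \ne (1,1)$. By Proposition 5.3 applied to $\neg\varphi$, $(N,s,v) \models
X_i\neg\varphi \wedge \neg K_i\varphi \Rightarrow \underline{\mathrm{Ev}}_i(\neg\varphi) \ge (1-\beta)/(2-(\alpha+\beta))$. By Lemma A.2, $(N,s,v) \models X_i\neg\varphi \Rightarrow \neg X_i\varphi$.
By Proposition 5.3 applied to $\varphi$,

$$(N,s,v) \models \neg X_i\varphi \wedge \neg K_i\varphi \Rightarrow \overline{\mathrm{Ev}}_i(\varphi) \le \frac{1-\alpha}{2-(\alpha+\beta)}.$$

Putting this together, we get

$$(N,s,v) \models X_i\neg\varphi \wedge \neg K_i\varphi \Rightarrow \left(\underline{\mathrm{Ev}}_i(\neg\varphi) \ge \frac{1-\beta}{2-(\alpha+\beta)} \wedge \overline{\mathrm{Ev}}_i(\varphi) \le \frac{1-\alpha}{2-(\alpha+\beta)}\right).$$

Since $s$ and $v$ are arbitrary,

$$N \models X_i\neg\varphi \wedge \neg K_i\varphi \Rightarrow \left(\underline{\mathrm{Ev}}_i(\neg\varphi) \ge \frac{1-\beta}{2-(\alpha+\beta)} \wedge \overline{\mathrm{Ev}}_i(\varphi) \le \frac{1-\alpha}{2-(\alpha+\beta)}\right).$$

We leave the proof of parts (b) and (d) to the reader. □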